AN EXAMPLE OF THE USE OF EXTENDED CROSS-OVER 
DESIGNS IN THE COMPARISON OF NPH 
INSULIN MIXTURES 


JOSEPH L. CIMINERA AND ELLWOOD K. WOLFE* 


Research and Quality Control Divisions 
Sharp & Dohme Division of Merck & Company, Inc. 
West Point, Pennsylvania 


INTRODUCTION 


When a solution of protamine and a solution of zinc insulin are 
combined, a precipitate of protamine zinc insulin is produced. Sus- 
pensions of this type have been employed in the management of diabetes 
and have advantage over unmodified insulin in that they control the 
patient’s blood sugar level over a relatively long period of time. 
Recently, a new type of long acting insulin, designated as NPH Insulin, 
has become available. NPH Insulin is a suspension of protamine zinc 
insulin such that the protamine content will not be less than, nor more 
than ten percent greater than, the quantity required for the isophane- 
ratio. The isophane-ratio is that ratio of insulin to protamine which 
results in equivalent amounts of insulin and protamine remaining in the 
supernatant as tested by nephelometric procedures. We were interested 
in determining whether it would be possible to detect a biological dif- 
ference in a preparation in which the protamine content was five percent 
less than the quantity required by the isophane-ratio. 

NPH Insulin is prepared from insulin which has been previously 
assayed and approved by two independent laboratories, as required by 
the Food and Drug Administration. For this reason, it is not anticipated 
that routine biological assays will be required to control the normal 
production of NPH Insulin. 




EXPERIMENTAL METHOD 


Two NPH Insulin mixtures were prepared: Mixture A (the “standard”) 
was prepared to contain the isophane-ratio of protamine to 
insulin; Mixture B (the “unknown”) was prepared to contain five percent 

*Present address: Camp Detrick, Frederick, Maryland 
less protamine than Mixture A. Each mixture contained a concentration 
of insulin equivalent to 40 units per ml.* 

The experimental procedure employed was a minor modification of 
the simple cross-over assay for Globin Zinc Insulin Injection, official 
in the U.S. Pharmacopoeia XIV (1). This modification consisted only of 
limiting the observation period to six hours instead of nine, with 
four post-injection bleeding times equally spaced within the six-hour 
period of observation. The real difference between the two mixtures was 
expected to be small. For this reason, it was planned beforehand to 
extend the cross-over to include both a switchback and a double switch- 
back design in an attempt to obtain more evidence upon which to base a 
decision. All three designs were analyzed separately and compared. 

Twenty-two female rabbits, weighing from 2.5 to 3.5 kg, were dis- 
tributed randomly between two equal groups. At weekly intervals, 
according to the scheme shown in Table I, single subcutaneous injections 
of either Mixture A or B were administered in volumes of 0.051 ml as 
measured from a micrometer syringe. 


TABLE I 
INJECTION SCHEDULE 

                          Period (Date) 
Group    1 (6-27-50)    2 (7-3-50)    3 (7-10-50)    4 (7-17-50) 
I             A              B              A              B 
II            B              A              B              A 


Blood samples were obtained from each rabbit at 0 (initial level), 
1.5, 3.0, 4.5, and 6.0 hours after injection. Blood sugar levels were 
determined for each sample. The bleeding times were spaced equally to 
facilitate computation of the results. The raw results for all animals 
at all time periods are presented in Table II. 


COMPUTATIONAL PROCEDURE 


Brandt (2) has thoroughly discussed the analysis of cross-over 
designs. Our data, however, include an additional sub-unit, bleeding 
times, and require an extension of Brandt’s methods. Although Brandt 
discusses the use of covariance, he does not consider the case where 
the concomitant variable is common to all values in the sub-unit. In 

*We are indebted to Dr. Robert J. Westfall, Biochemical Research Dept., Sharp & Dohme, for 
the preparation of these mixtures. The technical assistance of Messrs. J. J. Hogue, H. Maxwell, and 
S. H. Hunter in the performance of the assays is greatly appreciated. 
this experiment, the initial blood sugar values are necessarily common 
to the blood sugar values for all other bleeding times. 

The physiological relation between time and blood sugar levels, 
following insulin injection, is quite complex and is not adequately 
described by any simple function. Since the objective of the experiment 
was to determine the effect of a difference in protamine content, it was 
felt that the customary pharmacopoeial procedure, which specifies 
that the results be analyzed in terms of the differences in blood sugar 
levels between the two mixtures, should be adopted. The unit selected 
for analysis, therefore, was the difference in blood sugar levels pro- 
duced by the two mixtures in the same group of animals at the various 
test periods. Brandt (2) has shown that in a design of this type the 
blood sugar differences are confounded with the periods × groups 
interaction when a simple cross-over (2 test periods) is involved, with the 
quadratic component of periods × groups when three test periods are 
involved and with the cubic component of periods × groups when there 
are four test periods. For all but two test periods, where only a simple 
subtraction of Period 1 — Period 2 is necessary, the differences are 
most easily computed by means of polynomial coefficients, as given 
in Table VI and illustrated by example later. 

For the purpose of this study, the only relevant factors that need 
be considered are the nature of the differences between preparations 
and the possible influence of the initial blood sugar values upon sub- 
sequent readings. 


A. Analysis of Two-Period Differences 


Table III lists under “hours after injection” the difference in blood 
sugar level between the first and second test periods at each bleeding 
time. These are readily computed from the raw values in Table II. 
Thus, for rabbit number 1, the difference at three hours after injection is 

$$P_1 - P_2 = 35 - 52 = -17$$

The “totals” column represents the algebraic sum of the differences 
at 1.5, 3.0, 4.5, and 6.0 hours after injection. Thus, for rabbit number 2, 

$$4 - 17 - 25 - 30 = -68$$

The remaining columns list the linear, quadratic, and cubic components 
of the blood sugar differences for each animal. These were computed as 
the sums of the products of the differences at each bleeding time and the 
corresponding orthogonal polynomial coefficients for n’ = 4 equally 
spaced bleeding times. The latter may be obtained from Fisher and 
Yates (3) and are reproduced for convenience in Table IV. 
TABLE III 
DIFFERENCES AND TERMS FOR TWO-PERIOD ANALYSIS 

Rabbit                  Hours after Injection        Totals    Sums of Products (y)   
No.           0 (x)     1.5     3.0     4.5     6.0     (y)  Linear  Quadratic   Cubic
1               -13       5     -17     -12     -26     -50     -88          8     -46
2                -8       4     -17     -25     -30     -68    -110         16     -10
3               -13      13      34      -8       4      43     -69         -9     117
4               -26       9     -25     -12     -34     -62    -116         12     -82
5                -4      13      -8     -17     -26     -38    -126         12     -12
6                13       0      -8      17      31      40    -118         22     -44
7                 0      21     -12      -5      -9      -5     -83         29     -51
8                -4      27      -8       0     -21      -2    -136         14     -72
9                 9      25       0     -30     -34     -39    -207         21      31
10                9       9     -23      -4     -17     -35     -59         19     -83
11               -9      12     -12      -5     -17     -22     -80         12     -50
Subtotals       -46     138     -96    -101    -179    -238    -956        156    -302

12               13      -9      -8       0     -26     -43     -43        -27     -41
13               -9      16      13      17      25      71      31         11      -3
14               -5       0     -22      21       0      -1      43          1    -129
15                0      -4      -5      25       4      20      54        -20     -82
16                0     -12     -17       8      -4     -25      49         -7     -67
17                0      21       8      -4     -43     -18    -204        -26     -28
18                0       9       5     -12     -38     -36    -158        -22       4
19                0      10     -10      -8       0      -8     -28         28     -16
20               -5       0      -4       0       4       0      16          8      -8
21                0       8      -6      34      47      83     157         27     -81
22                0      13       5      13       4      35     -19         -1     -33
Subtotals        -6      52     -41      94     -27      78    -102        -28    -484

Differences between 
Subtotals       -40      86     -55    -195    -152    -316    -854        184     182


As an example, the sum of products for the linear component for rabbit 
number 3 would be derived from Tables III and IV as follows: 


$$(13)(-3) + (34)(-1) + (-8)(+1) + (4)(+3) = -69$$
TABLE IV 
ORTHOGONAL POLYNOMIAL COEFFICIENTS FOR n’ = 4 

Component          Coefficients 
Linear          -3    -1    +1    +3 
Quadratic       +1    -1    -1    +1 
Cubic           -1    +3    -3    +1 
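
The component calculation can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, using the Table IV coefficients and the differences for rabbit number 3 from Table III; the function and variable names are illustrative and do not come from the paper.

```python
# Orthogonal polynomial coefficients for four equally spaced bleeding times (Table IV).
COEFFICIENTS = {
    "linear": (-3, -1, 1, 3),
    "quadratic": (1, -1, -1, 1),
    "cubic": (-1, 3, -3, 1),
}


def components(differences):
    """Return the total and the linear, quadratic, and cubic sums of products
    for one rabbit's differences at 1.5, 3.0, 4.5, and 6.0 hours."""
    result = {"total": sum(differences)}
    for name, coefs in COEFFICIENTS.items():
        result[name] = sum(c * d for c, d in zip(coefs, differences))
    return result


# Rabbit number 3 in Table III: differences 13, 34, -8, 4.
rabbit_3 = components((13, 34, -8, 4))
print(rabbit_3)
```

The linear term reproduces the value of -69 worked out in the text, and the other entries agree with the rabbit 3 row of Table III.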


The differences between the sub-totals as shown in the last row of 
Table III represent the differences between the two insulin mixtures 
(Mixture A − Mixture B). The difference between the sub-totals in 
the “totals” column is a measure of the difference in the mean blood sugar 
levels for the two mixtures, while the differences between the sub-totals 
for the “sums of products” columns characterize the nature of the 
difference over the four bleeding times, or, in other words, serve as a 
means of comparing the blood sugar curves produced by the two insulin 
mixtures. This will be elaborated on further in the section on 
interpretation of results. 

There now remains the problem of testing the mixture differences 
for significance and of determining the effect, if any, of the initial 
blood sugar levels upon the post-injection observations. The present 
official U.S. Pharmacopoeia (1) assays of long acting insulin prepara- 
tions take no account of a possible effect of the initial blood sugar 
level. Earlier investigators (4, 5), however, have shown that this may 
be of considerable importance. Covariance analysis with the initial 
blood sugar level as the concomitant variable was employed in this 
investigation. The analysis is conveniently laid out in the form shown 
in Table V. There are four sources of variation (differences) relevant 
to the mixture comparison, each with a sum of squares representing the 
difference between mixtures and an error term with 19 degrees of 
freedom, after adjusting for covariance. The analysis is conducted 
on a whole-unit basis. 

The sums of squares and products in all four sections are obtained 
from the terms in the second and the last four columns of Table III, 
labeled (x) and (y), respectively. The method of computation is the 
same for all four sections and is illustrated below for the linear term: 


$[x^2]$

$$\text{Mixtures} = (-40)^2/44 = 36.36$$

where 44 = (2)(22) = sum of the squares of the coefficients for obtaining 
the two-period differences times the number of rabbits. The former may 
be obtained directly from Table VI. 

$$\text{Error} = \frac{1}{2}\left[(-13)^2 + (-8)^2 + \cdots + (0)^2 - \frac{(-46)^2 + (-6)^2}{11}\right] = 813.18$$

where 11 = number of rabbits per group and 
2 = sum of the squares of the coefficients for obtaining the two- 
period differences and may be obtained directly from Table VI. 

$[xy]$

$$\text{Mixtures} = (-40)(-854)/44 = 776.36$$

$$\text{Error} = \frac{1}{2}\left[(-13)(-88) + \cdots + (0)(-19) - \frac{(-46)(-956) + (-6)(-102)}{11}\right] = -704.73$$

$[y^2]$

$$\text{Mixtures} = (-854)^2/44 = 16575.36$$

$$\text{Error} = \frac{1}{2}\left[(-88)^2 + \cdots + (-19)^2 - \frac{(-956)^2 + (-102)^2}{11}\right] = 82275.55$$

The residual total sum of squares and that for error are obtained in 
the usual way from 

$$[y^2] - \frac{[xy]^2}{[x^2]}$$

and the adjusted sum of squares for mixtures is obtained as the difference 
of these two residuals. 
The interpretation of this analysis and that for three and four test 
periods is deferred to the section on interpretation of the results. 
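
As a check on the procedure, the covariance adjustment for the linear term can be sketched in Python from the sums quoted above. This is a sketch under stated assumptions rather than the authors' own worksheet: the total line is taken as mixtures plus error, the error term carries 19 degrees of freedom after adjustment, and the function name `adjusted_f` is illustrative.

```python
def adjusted_f(mixtures, error, error_df=19):
    """Covariance-adjusted F-ratio for mixtures.

    `mixtures` and `error` are (Sxx, Sxy, Syy) triples of sums of squares
    and products; the total line is assumed to be their sum.
    """
    total = tuple(m + e for m, e in zip(mixtures, error))

    def residual(sxx, sxy, syy):
        # Residual sum of squares of y after regression on x.
        return syy - sxy ** 2 / sxx

    error_residual = residual(*error)
    total_residual = residual(*total)
    adjusted_mixtures = total_residual - error_residual  # 1 degree of freedom
    return adjusted_mixtures / (error_residual / error_df)


# Linear term, two-period analysis (values quoted in the text).
mixtures = ((-40) ** 2 / 44, (-40) * (-854) / 44, (-854) ** 2 / 44)
error = (813.18, -704.73, 82275.55)
f_linear = adjusted_f(mixtures, error)
print(round(f_linear, 2))
```

With these inputs the ratio comes out at about 4.00, in agreement with the two-period linear entry of Table XII.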


The coefficients for obtaining the required “differences” for two, 
three, and four test periods and their sums of squares are shown in 
Table VI. 


B. Analysis for Three-Period Differences 

Tables VII and VIII show the required terms and the analysis for 
three-period “differences”. 

The “differences” in columns 2 to 6 are obtained from the values in 
Table II and the coefficients for three test periods in Table VI. Thus, 
TABLE VI 
COEFFICIENTS FOR OBTAINING “DIFFERENCES” 

Test                               Sums of 
Periods      Coefficients          Squares 
   2         +1, -1                    2 
   3         +1, -2, +1                6 
   4*        +1, -3, +3, -1           20 

*The coefficients for four test periods normally would be -1, +3, -3, +1. The signs were 
reversed in order to make the results comparable with the two and three test period results. 


for Rabbit 1, at 1.5 hours after injection, the required “difference” is 
obtained as follows: 

$$(+1)(52) + (-2)(47) + (+1)(52) = 10$$


The other columns are computed in the same manner as for Table III. 

The method of computation for the analysis of variance is identical 
with that for Table V, except for a change in the divisors. The divisor 
for mixtures becomes 132 (= 6 × 22), while the divisor for error 
becomes 6. The latter is obtained from Table VI. 
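
The period “differences” and their divisors can be sketched as follows. The coefficients are those of Table VI; the helper names are illustrative. The four-period mixtures divisor of 440 (= 20 × 22) is an inference from the same rule and is not stated explicitly in the text.

```python
# Period coefficients from Table VI (four-period signs reversed as in its footnote).
PERIOD_COEFFICIENTS = {
    2: (1, -1),
    3: (1, -2, 1),
    4: (1, -3, 3, -1),
}
N_RABBITS = 22


def period_difference(values):
    """Combine one rabbit's readings over 2, 3, or 4 test periods into a 'difference'."""
    coefs = PERIOD_COEFFICIENTS[len(values)]
    return sum(c * v for c, v in zip(coefs, values))


def divisors(n_periods):
    """Return (mixtures divisor, error divisor) for the analysis of variance."""
    sum_sq = sum(c * c for c in PERIOD_COEFFICIENTS[n_periods])
    return sum_sq * N_RABBITS, sum_sq


# Rabbit 1 at 1.5 hours after injection over three periods: 52, 47, 52.
example = period_difference((52, 47, 52))
print(example, divisors(2), divisors(3))
```

The example reproduces the value of 10 quoted above, and the divisors for two and three periods agree with the 44 and 2, and 132 and 6, given in the text.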


C. Analysis for Four-Period Differences 
Tables IX and X show the required terms and the analysis for four- 
period “differences”. Computational procedure is as described above. 


INTERPRETATION OF RESULTS 


The mean difference between the two insulin mixtures for the two- 
period analysis is the simple difference between observations in the 
first and second periods. Using subscripts to denote the period, we have 


$$\frac{A_1 - B_2}{11} = \text{mean difference for Group I}$$

$$\frac{B_1 - A_2}{11} = \text{mean difference for Group II}$$

$$\frac{A_1 - B_2}{11} - \frac{B_1 - A_2}{11} = \text{difference of means} = \frac{A_1 + A_2}{11} - \frac{B_1 + B_2}{11}$$


440 BIOMETRICS, DECEMBER 1953 
TABLE VII 
DIFFERENCES AND TERMS FOR THREE-PERIOD ANALYSIS 

Rabbit                  Hours after Injection        Totals    Sums of Products (y)   
No.           0 (x)     1.5     3.0     4.5     6.0     (y)  Linear  Quadratic   Cubic
1               -18      10     -34     -41     -56    -121    -205         29     -45
2                10     -22      -9     -76     -94    -201    -283        -31     129
3                -4      22      43      18      34     117      11         -5      87
4               -56     -12     -16       5     -43     -66     -72        -44     -94
5                -8      13     -50     -59     -52    -148    -204         70     -38
6                 8       9      18      51      57     135     177         -3     -51
7                -9      38     -58     -27     -60    -107    -263         63    -191
8                 5      58     -16      13      30      85     -55         91    -115
9                 5      42     -21     -43     -34     -56    -250         72     -10
10                9      39       2      47       1      89     -69         -9    -173
11                2      29     -33      -1     -17     -22    -106         46    -142
Subtotals       -56     226    -174    -113    -234    -295   -1319        279    -643

12               26     -18     -12      17     -17     -30      32        -40     -86
13              -18       7      -8      38      72     109     241         49     -73
14              -22     -13     -40      34       4     -15     125         -3    -205
15                5     -31     -14      58      39      52     282        -36    -146
16                5     -16     -38      -5      -4     -63      69         23     -87
17              -13      21     -13      -8     -48     -48    -202         -6     -84
18              -13      22      39      22     -16      67    -131        -55      13
19              -13      10     -24     -30     -38     -82    -150         26     -30
20              -10       9      -8       4       4       9      -3         17     -41
21                0       4     -19      34      60      79     221         49    -103
22               -4       9       1      47      22      79      85        -17    -125
Subtotals       -57       4    -136     211      78     157     569          7    -967

Differences       1     222     -38    -324    -312    -452   -1888        272     324


The average differences at each of the four bleeding times after 
injection are obtained, therefore, from the differences in the last row 
of Table III after division by 22 (= 11 × 2). 

Similarly, it can be shown that the weighted difference of means for 
the three-period analysis is 

$$\frac{A_1 + 2A_2 + A_3}{11} - \frac{B_1 + 2B_2 + B_3}{11}$$
TABLE IX 
DIFFERENCES AND TERMS FOR FOUR-PERIOD ANALYSIS 

Rabbit                  Hours after Injection        Totals    Sums of Products (y)   
No.           0 (x)     1.5     3.0     4.5     6.0     (y)  Linear  Quadratic   Cubic
1               -32       7     -76    -108    -120    -297    -413         71     -31
2                24     -82      -1    -187    -222    -492    -606       -116     418
3                 1      10      31      27      55     123     131          7      57
4              -125     -59      23      18    -108    -126    -152       -208     -34
5               -16      17    -121    -147    -113    -364    -416        172     -52
6               -23       1      15      59      66     141     239         -7     -67
7               -39      42    -129    -109    -183    -379    -655         97    -285
8                27     110     -28      47     153     282     204        244    -182
9               -16      63     -63     -73     -30    -103    -289        169     -63
10                0     103      35      94      32     264    -154          6    -248
11               21      46     -59      -1     -21     -35    -143         85    -241
Subtotals      -178     258    -373    -380    -491    -986   -2254        520    -728

12               31     -69     -42      13      -8    -106     238        -48    -104
13              -53     -11    -106      24     102       9     469        173    -277
14              -60     -56     -93      30      10    -109     321         17    -303
15                8     -98     -53      77      66      -8     622        -56    -226
16               15     -29     -76     -31     -22    -158      66         56    -128
17              -43       9     -59     -33     -60    -143    -181         41    -147
18              -60      31      69      30      -6     124    -150        -74      80
19              -43      -4     -42     -62     -76    -184    -236         24     -12
20              -35     -11     -54     -17      13     -69     109         73     -87
21               -5      -8     -41      -4      56       3     229         93     -47
22              -21     -20     -12      90      45     103     297        -53    -241
Subtotals      -266    -266    -509     117     120    -538    1784        246   -1492

Differences      88     524     136    -497    -611    -448   -4038        274     764


and the average differences are obtained from the differences in the last 
row of Table VII after division by 44 (= 11 × 4). 

Finally, the weighted difference of means for the four-period analysis 
is 

$$\frac{A_1 + 3A_2 + 3A_3 + A_4}{11} - \frac{B_1 + 3B_2 + 3B_3 + B_4}{11}$$

and the average differences are obtained from the differences in the last 
row of Table IX after division by 88 (= 11 × 8). 

The average differences between the two insulin mixtures at the four 
bleeding times after injection and after two, three, and four periods are 
given in Table XI and shown graphically in Figure I. 


TABLE XI 
WEIGHTED AVERAGE DIFFERENCES (A MINUS B) IN MG. PERCENT 
OF BLOOD SUGAR 

                            Number of Test Periods 
Hours After Injection         2         3         4 
        1.5                  3.9       5.0       6.0 
        3.0                 -2.5      -0.9       1.5 
        4.5                 -8.9      -7.4      -5.6 
        6.0                 -6.9      -7.1      -6.9 
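
The entries of Table XI follow directly from the “Differences” rows of Tables III, VII, and IX and the divisors 22, 44, and 88 given in the text. A short Python sketch of that division is shown here; the dictionary and function names are illustrative.

```python
# "Differences" rows at 1.5, 3.0, 4.5, and 6.0 hours (Tables III, VII, and IX).
DIFFERENCE_ROWS = {
    2: (86, -55, -195, -152),
    3: (222, -38, -324, -312),
    4: (524, 136, -497, -611),
}
# Divisors quoted in the text: 11 x 2, 11 x 4, and 11 x 8.
DIVISORS = {2: 22, 3: 44, 4: 88}


def average_differences(n_periods):
    """Weighted average difference (A minus B) at each bleeding time, in mg. percent."""
    divisor = DIVISORS[n_periods]
    return tuple(round(d / divisor, 1) for d in DIFFERENCE_ROWS[n_periods])


for periods in (2, 3, 4):
    print(periods, average_differences(periods))
```

Each divisor is 11 times the sum of the positive period coefficients implied by the weighted difference of means (2, 4, and 8 respectively).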


FIGURE I. CURVES OF AVERAGE WEIGHTED DIFFERENCES (MIXTURE A MINUS 
MIXTURE B) AGAINST HOURS AFTER INJECTION, AFTER TWO TEST PERIODS, 
THREE TEST PERIODS, AND FOUR TEST PERIODS. 


In the two-period and three-period tests Mixture B (which was 
prepared to contain five percent less protamine) gave lower blood sugar 
values than Mixture A at one and one-half hours after injection, but 
thereafter permitted a more rapid recovery to normal blood sugar 
levels. In the four-period test, more rapid recovery occurred four and 
one-half hours after injection. Apparently, the five percent decrease 
in protamine content of Mixture B resulted in a more drastic initial 
reduction of blood sugar and a less prolonged effect, which is what 
would be expected on the basis of experience with protamine-insulin 
mixtures. 


The F-ratios obtained after two, three, and four periods are sum- 
marized in Table XII. 


TABLE XII 
F-RATIOS AFTER TWO, THREE, AND FOUR TEST PERIODS 

                          Number of Test Periods 
Source of Variation         2         3         4 
Totals                    2.03      1.00      0.26 
Linear                    4.00      7.02*     9.15** 
Quadratic                 5.95*     1.89      0.17 
Cubic                     0.53      0.71      0.90 

*Significant (P < 0.05) 
**Significant (P < 0.01) 


The “Totals” is a measure of the mean differences between the two 
insulin mixtures averaged over all four bleeding times after injection. 
That none of the ‘‘Totals” are significant was not surprising since the 
initial differences were positive and the subsequent differences negative. 
In general, varying the protamine content of a long acting insulin 
mixture does not displace the blood sugar curve, but simply alters its 
shape. Because of this, mean differences are often of little value in 
comparing insulin preparations, whereas differences in the components 
of regression are of the greatest importance. 

The linear, quadratic and cubic terms under ‘Source of Variation’ 
in Table XII are a measure of the differences in the blood sugar curves 
resulting from the two insulin mixtures. Practice is necessary to obtain 
facility in interpreting the “difference” curve (Fig. I) without simul- 
taneous reference to the actual blood sugar curves for each mixture. 
Brief consideration, however, will clarify the general principles involved. 
A horizontal, straight line at the ‘0’ ordinate would indicate identical 
response to the two mixtures. A horizontal, straight line at any ordinate 
other than 0 would indicate parallel response, or a simple curve dis- 
placement. Any “difference” curve with a significant slope, or significant 
curvature, indicates fundamental differences in the time-response curves 
of the test preparations. 

For the two-period analysis the quadratic component was significant 
at the five percent level. The linear component was significant at the 
five percent level for the three-period analysis and at the one percent 
level for the four-period analysis. The reason for this is readily apparent 
from Figure I. The curve of differences for the four bleeding times 
tends, overall, toward linearity as the number of test periods increases 
from two to four. This probably was due to the rabbits becoming less 
sensitive to insulin in the last two periods of the experiment. It will be 
noted from Table II that the blood sugar levels for the fourth period 
consistently were higher than those for the other periods. 

Adjustment for initial blood sugar levels by covariance had no effect 
on the final results and in no case was there a significant reduction in 
error due to covariance. This was to be expected since a preliminary 
analysis of the initial values had shown that although there were signifi- 
cant period differences, the comparative levels were consistent for the 
two groups of rabbits from period to period. 

The real difference between the two mixtures was small. It may be 
fortuitous that this was found to be significant in the simple cross-over 
experiment. On the other hand, significant differences persisted as the 
experiment was prolonged to include switchback and double switchback 
designs, despite the apparent decrease in sensitivity of the rabbits. 

The extended cross-over designs yielded essentially the same results 
as the simple cross-over experiment. At all stages of the investigation, 
there was evidence of a more rapid recovery of blood sugar levels in 
animals receiving the mixture with less protamine. The gain in dis- 
criminatory power resulting from the switchback designs does not seem 
to be commensurate with the additional cost, and danger of animal loss 
associated with these extended tests. Replicated simple cross-over de- 
signs would be more efficient from the standpoint of economies and might 
be expected to yield information of equal, or better, statistical validity. 

Thanks are due to Mr. Robert A. Harte, Research Administrator for 
Sharp & Dohme, for critical reading of the manuscript; to the referee for 
valuable suggestions as to the form of the analysis and the presentation 
of data; and to Dr. J. R. Monroe, University of North Carolina (State), 
for suggestions on the covariance procedure. 


REFERENCES 


1. U.S. Pharmacopoeia XIV: 298, 1950. 

2. Brandt, A. E., Iowa Ag. Exp. Station Res. Bull. 234, 1938. 

3. Fisher, R. A. and Yates, F., Statistical Tables for Biological, Agricultural and 
Medical Research, 3rd Ed., Hafner Pub. Co., Inc., N. Y., 1948. 

4. Hemmingsen and Marks, Quart. J. Pharm. and Pharmacol., 5: 245, 1932. 

5. Bliss and Marks, Quart. J. Pharm. & Pharmacol., 12: 182, 1939. 




A SAMPLING INVESTIGATION OF THE EFFICIENCY 
OF WEIGHTING INVERSELY AS THE ESTIMATED 
VARIANCE* 


WILLIAM G. COCHRAN AND SARAH PORTER CARROLL 


Johns Hopkins, Baltimore, Md. 
and 
Institute of Statistics, Raleigh, N. C. 


1. INTRODUCTION 


Suppose that we have a number of estimates $x_i$ ($i = 1, 2, \ldots, k$), 
normally and independently distributed about the same mean $\mu$ with 
different variances $\sigma_i^2$. If the values of the $\sigma_i^2$ are known, the best 
estimate of $\mu$ is generally agreed to be the weighted mean 

$$\bar{x}_w = \sum_{i=1}^{k} w_i x_i / w, \quad \text{where } w_i = 1/\sigma_i^2, \; w = \sum_{i=1}^{k} w_i .$$

If the $\sigma_i^2$ are not known, but we possess estimated variances $s_i^2$, 
based on $n_i$ degrees of freedom, respectively, analogy suggests the use 
of a weighted mean with weights inversely proportional to the estimated 
variances. This mean is 

$$\bar{x}_{\hat{w}} = \sum_{i=1}^{k} \hat{w}_i x_i / \hat{w}, \quad \text{where } \hat{w}_i = 1/s_i^2, \; \hat{w} = \sum_{i=1}^{k} \hat{w}_i .$$


Data of this kind may occur when $k$ laboratories make separate 
determinations $x_i$ of the same physical or chemical quantity, each with 
an estimated standard error, or when a summary is being made of the 
results of $k$ replicated experiments, in each of which the difference $x_i$ 
between a specified pair of treatments has been observed. In practice, 
it cannot be taken for granted that the observations $x_i$ are all estimates 
of the same mean $\mu$, because personal biases or local conditions of 
experimentation may render this assumption false. The discussion in 
this paper is confined to situations in which the assumption holds. 
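
The two weighted means differ only in whether the true or the estimated variances supply the weights. A minimal Python sketch follows; the function name and the three estimates with their estimated variances are hypothetical, chosen only to illustrate the arithmetic.

```python
def weighted_mean(estimates, variances):
    """Mean of `estimates` weighted inversely as `variances`.

    Pass the true variances to get the mean with known weights, or the
    estimated variances s_i^2 to get the mean with estimated weights.
    Returns (mean, total weight).
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total_weight
    return mean, total_weight


# Hypothetical data: three determinations with estimated variances 1, 4, and 2.
estimates = (10.0, 12.0, 11.0)
estimated_variances = (1.0, 4.0, 2.0)
mean_hat, w_hat = weighted_mean(estimates, estimated_variances)
print(mean_hat, w_hat)
```

Here the weights are 1, 0.25, and 0.5, so the estimate with the smallest estimated variance dominates the result.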


*Research conducted under a contract with the Office of Naval Research. 
Some results about the distribution of $\bar{x}_{\hat{w}}$ are known. When the 
degrees of freedom $n_i$ are all equal to $n$, the limiting distribution of 
$\bar{x}_{\hat{w}}$, as the number of estimates $k$ tends to infinity, is normal, (1), with 
mean $\mu$ and variance 

$$V(\bar{x}_{\hat{w}}) = \frac{(n - 2)}{(n - 4)} \cdot \frac{1}{w} \tag{1}$$

The proof requires that $n > 8$ and that the $\sigma_i^2$ are bounded above 
and below. When the $n_i$ are not equal, the limiting variance takes 
the more complex form 


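$$V(\bar{x}_{\hat{w}}) = \left\{ \sum_{i=1}^{k} \frac{n_i^2 w_i}{(n_i - 2)(n_i - 4)} \right\} \bigg/ \left\{ \sum_{i=1}^{k} \frac{n_i w_i}{(n_i - 2)} \right\}^2 \tag{2}$$

For practical applications, these results may be used as approximations 
when the number of estimates $k$ that are being combined is large. 
Until recently, no information has been available as to how well the 
results apply when $k$ is small. However, Meier (2) has given an 
approximation to $V(\bar{x}_{\hat{w}})$, valid for any $k$, but neglecting terms of order $1/n_i^2$. 
His result is 

$$V(\bar{x}_{\hat{w}}) = \frac{1}{w} \left[ 1 + \frac{2}{w^2} \sum_{i=1}^{k} \frac{1}{n_i} w_i (w - w_i) \right] \tag{3}$$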
Variance formulas (1), (2) and (3) are useful for comparing the 
precision of $\bar{x}_{\hat{w}}$ with that of other simple estimates of $\mu$, in particular 
with the unweighted mean of the $x_i$. These formulas cannot be used, 
however, to attach a standard error to an actual value $\bar{x}_{\hat{w}}$ that has been 
obtained from a set of data, because the formulas involve the unknown 
true weights $w_i$. For this purpose, Cochran (1) showed that an 
unbiased estimate of the limiting variance, when the $n_i$ are equal, is 

$$\hat{V}(\bar{x}_{\hat{w}}) = \frac{n}{(n - 4)\hat{w}} \tag{4}$$

Similarly, Meier (2) has shown that an unbiased estimate of (3), 
neglecting terms of order $1/n_i^2$, is 

$$\hat{V}(\bar{x}_{\hat{w}}) = \frac{1}{\hat{w}} \left[ 1 + \frac{4}{\hat{w}^2} \sum_{i=1}^{k} \frac{1}{n_i} \hat{w}_i (\hat{w} - \hat{w}_i) \right] \tag{5}$$


The present paper gives the results of sampling investigations which 
were carried out by the junior author (3) in order to learn something 
about the variance of $\bar{x}_{\hat{w}}$ when $n$ and $k$ are both small. Although the 
scope of these investigations was restricted by the heavy computation 
involved, as is often the case with sampling studies, the results provide 
a partial check on the range of application of Meier’s formulas and give 
some information for values of n and k that are beyond this range. 
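
For orientation, the variance formulas of this section can be sketched in Python. The sketch assumes the forms V = (n − 2)/((n − 4)w) for the limiting variance with equal n, V ≈ (1/w)[1 + (2/w²) Σ wᵢ(w − wᵢ)/nᵢ] for Meier's approximation, and the corresponding estimates n/((n − 4)ŵ) and (1/ŵ)[1 + (4/ŵ²) Σ ŵᵢ(ŵ − ŵᵢ)/nᵢ]; the function names are illustrative only.

```python
def limiting_variance(n, w):
    """Limiting variance (k large, all n_i = n) with true total weight w; needs n > 4."""
    return (n - 2) / ((n - 4) * w)


def meier_variance(weights, dfs):
    """Meier's approximation to the variance, using the true weights w_i."""
    w = sum(weights)
    correction = sum(wi * (w - wi) / ni for wi, ni in zip(weights, dfs))
    return (1 + 2 * correction / w ** 2) / w


def limiting_variance_estimate(n, w_hat):
    """Estimate of the limiting variance from the total estimated weight; needs n > 4."""
    return n / ((n - 4) * w_hat)


def meier_variance_estimate(est_weights, dfs):
    """Meier's estimate of the variance, using the estimated weights."""
    w_hat = sum(est_weights)
    correction = sum(wi * (w_hat - wi) / ni for wi, ni in zip(est_weights, dfs))
    return (1 + 4 * correction / w_hat ** 2) / w_hat


# Two estimates with equal true weights and 8 degrees of freedom each.
approx_two = meier_variance([1.0, 1.0], [8, 8])
print(approx_two, limiting_variance(8, 2.0))
```

For two equally weighted estimates with n = 8, the approximation gives a variance of 1.125/w, close to the exact value (n + 2)/((n + 1)w) = 1.111/w discussed later in the paper.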


2. METHOD OF CALCULATION 


At first sight, the sampling investigations appeared a formidable 
task because of the multiplicity of variables. Even confining attention 
to the case where all n; are equal to n, it was desired to cover rather 
thoroughly the range of values of both n and k between 2 and 20. Then 
there was the problem of what sets of variances $\sigma_i^2$ should be investigated. 

It appeared, however, that if the variance of $\bar{x}_{\hat{w}}$ was expressed in 
the form 

$$V(\bar{x}_{\hat{w}}) = \frac{f(n, k)}{w} \tag{6}$$

the factor f(n, k) would be relatively insensitive to variations in the $\sigma_i^2$. 

Several results support this conjecture. From equation (1) it 
follows that the limiting value of f(n, k), as k tends to infinity, is 
(n − 2)/(n − 4), for any bounded set of values of $\sigma_i^2$. Further, as n 
tends to infinity, for any fixed k, f(n, k) tends to 1, since the weights 
then become the correct weights. When k = 2, the correct variance 
can be obtained by numerical integration. Calculations of the variance 
by this method for a few sets of values of $n_1$ and $n_2$ (Porter, 1947) 
showed that f(n, k) changed by only a few per cent for $\sigma_1^2/\sigma_2^2$ lying 
between 0.1 and 10. 

Finally, some sampling computations of the variance were made for 
three different sets of values of $\sigma_i^2$. In the first set, all $\sigma_i^2$ were taken as 
1; in the second, the values were 1/2, 1 and 2, each value holding for 
one-third of the $x_i$’s; in the third, the values were 1/4, 1 and 4. Results 
are shown in table 1 for k = 3, 6, 12 and 15 and for n = 6, 10 and 20. 

As the values of $\sigma_i^2$ become more unequal, f(n, k) tends to decline. 
The decreases are small in all cases in table 1, the maximum drop being 
about 7 per cent for n = 6, k = 12 and 15. The results suggest that 
computations made for $\sigma_i^2$ all equal will tend to give values of f(n, k) 
that are slightly too high but not far in error. Consequently, the 
principal calculations were made for the case in which all $\sigma_i^2$ are equal to 1. 

The procedure was as follows. In samples in which the $s_i^2$ are fixed, 
and $\sigma_i^2 = 1$, $\bar{x}_{\hat{w}}$ is normally distributed with mean $\mu$ and variance 

$$\sum_{i=1}^{k} \hat{w}_i^2 / \hat{w}^2 \tag{7}$$
TABLE 1 
Effect of inequality in $\sigma_i^2$ on f(n, k) 
Values of f(n, k) 

                            k = number of estimates 
 n     $\sigma_i^2$           3       6      12      15 
 6     (1, 1, 1)          [?]    1.39    1.54    1.59 
       (1/2, 1, 2)       1.22    1.39     [?]    1.54 
       (1/4, 1, 4)       1.18     [?]    1.43    1.47 
10     (1, 1, 1)          [?]     [?]    1.28    1.30 
       (1/2, 1, 2)        [?]     [?]     [?]    1.30 
       (1/4, 1, 4)       1.11    1.19    1.25    1.27 
20     (1, 1, 1)         1.06    1.10    1.12     [?] 
       (1/2, 1, 2)       1.05    1.09    1.11     [?] 
       (1/4, 1, 4)       1.04    1.08     [?]     [?] 


The values of $s_i^2$, and hence of $\hat{w}_i = 1/s_i^2$, were obtained by squaring 
and adding from a table of normal deviates (4). The values of $\hat{w}_i$ were 
then grouped in sets of k, each set yielding one value of the conditional 
variance of $\bar{x}_{\hat{w}}$ by substitution in (7). Enough sets were computed for 
each n and k so that the mean value of the variance over the group 
appeared stable (the average coefficient of variation of the mean was 
1.9 per cent). Finally, since w = k when all $\sigma_i^2$ are equal to 1, the 
factor f(n, k) is k times this mean variance, as can be seen from 
equation (6). 
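
The sampling procedure just described can be imitated with a pseudo-random generator in place of a printed table of normal deviates. The following Python sketch is an assumption-laden illustration, not the authors' computation: each $s_i^2$ is taken as the sum of n squared standard normal deviates divided by n, and the function name, seed, and number of sets are arbitrary choices.

```python
import random


def simulate_f(n, k, n_sets=20000, seed=1):
    """Estimate f(n, k) when all true variances equal 1.

    For each set of k estimated weights, the conditional variance of the
    weighted mean is sum(w_i^2) / (sum(w_i))^2; f(n, k) is k times its average.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sets):
        # w_hat_i = 1 / s_i^2, with s_i^2 = (sum of n squared normal deviates) / n.
        w_hat = [n / sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) for _ in range(k)]
        w_sum = sum(w_hat)
        total += sum(w * w for w in w_hat) / w_sum ** 2
    return k * total / n_sets


f_8_2 = simulate_f(8, 2)
print(round(f_8_2, 3))
```

For k = 2 the simulated value can be compared with the exact result (n + 2)/(n + 1), which is about 1.111 for n = 8.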


TABLE 2 
Values of f(n, k) such that $V(\bar{x}_{\hat{w}}) = f(n, k)/w$ 

k = number of estimates that are being combined 

n 2* 2 3 4 5 6 8 10 12 15 20 ∞† 

2 133° 2 (35 M61" 92) 217 Sad Forse sas) Sash 76 5.88 foe) 
4 1.20) 12225 1.36) 2.49 (1.61 “1e72" 11.929 9.18 9.33) 2.66 2586 oo 
6 DLA 16) 22 Ue0 ooo) ol sO) led bed AON 154 eso 1.64 2.00 
8 TLL LT 2023-28) es soe dee deso) ede leah aeaO 
10 1 O99) 109 U6: SS BA O0N Telok aos mah 7 a mosmTEo() 2-32-33 
12 108 S109 25s 1 16e Seed ae 20 al oon de os P23 ete} 
15 TOG 00 SOT 2 5 es eG Sado) ee 20 ee OOMeTS 
20 INOS 057716069 1:09 1010 21 1O™ ede ei ed 12 “11S ie loe at ao 


*These values obtained by the formula f(n, k) = (n + 2)/(n + 1) 
†These values obtained by the formula f(n, ∞) = (n - 2)/(n - 4) 
3. SAMPLING RESULTS FOR THE VARIANCE OF $\bar{x}_{\hat{w}}$ 


The values obtained for f(n, k) are shown in table 2. For k = 2, 
with the $\sigma_i^2$ all equal, the exact value of f(n, k) is easily found to be 
(n + 2)/(n + 1). These exact values appear in the first column of 
table 2; the corresponding values from the sampling investigation 
appear in the second column, and indicate good agreement with the 
exact values. 

For k = ∞, the values shown in table 2 are obtained from the formula 
(n − 2)/(n − 4)w for the variance of $\bar{x}_{\hat{w}}$ in the limiting distribution, as 
given previously in equation (1). 

Since the variance of $\bar{x}_w$ is 1/w when the weights are known exactly, 
the quantity f(n, k) is the factor by which the variance is inflated owing 
to errors in the estimated weights $\hat{w}_i$. Table 2 indicates that this 
inflation is less serious when k is small than when k is large. With 
weights based on 8 degrees of freedom, for instance, the variance is 
inflated by 50 per cent when many estimates are being combined, but 
only by 11 per cent when two estimates are being combined. 
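
The two boundary columns of table 2 follow from the closed forms quoted above, which can be evaluated directly. The short Python sketch below uses illustrative function names and reproduces the 11 per cent and 50 per cent inflation figures cited for n = 8.

```python
def f_two_estimates(n):
    """Exact inflation factor for k = 2 with equal variances."""
    return (n + 2) / (n + 1)


def f_many_estimates(n):
    """Limiting inflation factor as k tends to infinity; finite only for n > 4."""
    if n <= 4:
        return float("inf")
    return (n - 2) / (n - 4)


inflation_two = 100 * (f_two_estimates(8) - 1)
inflation_many = 100 * (f_many_estimates(8) - 1)
print(round(inflation_two), round(inflation_many))
```

For n of 4 or less the limiting factor is not finite, which corresponds to the infinite entries in the last column of table 2.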


4. COMPARISON WITH THE UNWEIGHTED MEAN 


A simple alternative to $\bar{x}_{\hat{w}}$ is the unweighted mean $\bar{x}$. A comparison 
of the precisions of $\bar{x}$ and $\bar{x}_{\hat{w}}$ is of practical interest, because there is no 
point in undertaking the extra calculation involved in $\bar{x}_{\hat{w}}$ unless a 
reasonable gain in precision is anticipated. 

The situation most favorable to the unweighted mean is that when 
the $\sigma_i^2$ are all equal. In this event the unweighted mean is fully efficient. 
Consequently, the values of f(n, k) in table 2 indicate the maximum 
inflation in variance that will occur if $\bar{x}_{\hat{w}}$ is used in place of $\bar{x}$. Since 
in practice we do not know by how much the $\sigma_i^2$ vary, we might be willing 
to regard this inflation of the variance as a premium paid for insurance 
against the possibility that the $\sigma_i^2$ vary greatly (in which event $\bar{x}$ would 
be of low efficiency). Table 2 suggests that if n exceeds 20, the premium 
is not high, but over most of the table the potential inflation of variance 
is unfortunately well over 10 per cent. 

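More generally, the variance of $\bar{x}$ is

$$V(\bar{x}) = \frac{1}{k^2} \sum_{i=1}^{k} \sigma_i^2 = \frac{1}{k^2} \sum_{i=1}^{k} \frac{1}{w_i},$$

as compared with the approximate variance for the weighted mean,

$$V(\bar{x}_{\hat{w}}) = \frac{f(n, k)}{w}.$$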
By comparing the two variances, working recommendations can be
made about the use of the two estimates. The difficulty is, however,
to know what amount of variation in the $\sigma_i^2$ is typical of practical con-
ditions. Comparisons will be given for the case k = 2. In this case

$$V(\bar{x}) = \frac{w}{4 w_1 w_2}, \qquad V(\bar{x}_{\hat{w}}) = \frac{n + 2}{(n + 1) w},$$

where we have used the approximation to f(n, 2) given in the first
column of table 2.
Hence the relative precision of $\bar{x}_{\hat{w}}$ to $\bar{x}$ is

$$\frac{V(\bar{x})}{V(\bar{x}_{\hat{w}})} = \frac{(n + 1)}{(n + 2)} \cdot \frac{w^2}{4 w_1 w_2} = \frac{(n + 1)}{(n + 2)} \cdot \frac{(1 + \varphi)^2}{4 \varphi},$$

where $\varphi = w_1/w_2 = \sigma_2^2/\sigma_1^2$ is the ratio of the variances of the two
estimates $x_1$ and $x_2$. Table 3 shows the relative precision (in per cent)
for a series of values of n and $\varphi$.


TABLE 3


Relative precision (in per cent) of the weighted to the
unweighted mean, for k = 2.


                 $\varphi = \sigma_2^2/\sigma_1^2$
  n       1    1.5      2      3      4      6
  2      75     78     84    100    117    153
  4      83     87     94    111    130    170
  6      88     91     98    117    137    179
  8      90     94    101    120    141    184
 10      92     96    103    122    143    187
 12      93     97    105    124    145    190
 20      95    100    107    127    149    195
  ∞     100    104    112    133    156    204
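
The entries of table 3 can be regenerated from the relative-precision formula for k = 2, namely 100 (n + 1)(1 + φ)² / [4φ(n + 2)]. The following is a minimal sketch; the function name `relative_precision` is illustrative, and an individual entry may differ from the printed table by one unit in the last place because of rounding.

```python
import math


def relative_precision(n, phi):
    """Per cent precision of the weighted mean relative to the unweighted mean, k = 2.

    n   : degrees of freedom behind each estimated weight (math.inf allowed)
    phi : ratio of the two true variances, sigma_2^2 / sigma_1^2
    """
    df_factor = 1.0 if math.isinf(n) else (n + 1.0) / (n + 2.0)
    return 100.0 * df_factor * (1.0 + phi) ** 2 / (4.0 * phi)


if __name__ == "__main__":
    ratios = (1, 1.5, 2, 3, 4, 6)
    for n in (2, 4, 6, 8, 10, 12, 20, math.inf):
        print(n, [round(relative_precision(n, phi)) for phi in ratios])
```

Values below 100 mark the combinations of n and φ for which the unweighted mean is the more precise of the two.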


If the variance ratio for the two estimates lies between 1 and 2
the gain in precision from the weighted mean is at
most 12 per cent, and the smaller values of n show a loss in precision.
When the variance ratio exceeds 3, on the other hand, the weighted
mean is superior, or as good, for all values of n down to 2, and the gains
in precision may be substantial.

To summarize, the unweighted mean is preferable if the ratio of the 
larger (true) variance to the smaller is not more than 2. If the ratio 
lies between 2 and 3, the unweighted mean appears preferable unless the 
weights are each based on, say, at least 12 degrees of freedom. If the
ratio exceeds 3, the weighted mean is preferable even if only 4 degrees 
of freedom are available to estimate the weights. 


5. COMPARISON WITH MEIER’S FORMULA 


Table 1 also provides a partial check on Meier’s approximate formula
(3) for $V(\bar{x}_{\hat{w}})$, subject to the restrictions that the comparison covers only
the case where the $\sigma_i^2$ are equal and that the values in table 1 are
themselves subject to some sampling error. When all $w_i$ are equal and all
$n_i$ are equal, Meier’s formula reduces to

$$V(\bar{x}_{\hat{w}}) = \frac{1}{w}\left[1 + \frac{2(k - 1)}{nk}\right]. \tag{8}$$


The ratios of the variances in (8) to those in table 2 are shown in 
table 4. 


TABLE 4
Ratio of variance given by Meier’s formula to variance in
Table 2
                          k = number of estimates
  n      2     3     4     5     6     8    10    12    15    20     ∞
  2   1.13  1.04   .91   .83   .75   .66   .55   .50   .41   .33   .00
  4   1.04   .99   .93   .87   .83   .75   .68   .63   .57   .52   .00
  6   1.03  1.00   .96   .94   .92   .89   .87   .85   .82   .80   .66
  8   1.01   .97   .97   .94     ?     ?   .89   .88   .86   .86   .83
 10   1.01   .98     ?     ?   .96   .94   .93   .92   .92   .90   .90
 12   1.00   .99     ?     ?     ?     ?   .96   .96   .94   .94   .94
 15   1.01   .99   .98   .96   .97   .97   .95   .94   .93   .94   .96
 20   1.00  1.01   .99   .98   .98   .98   .98   .97   .97   .98   .98
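
Two columns of table 4 have closed forms and can be checked directly. The sketch below takes formula (8) to be (1/w)[1 + 2(k − 1)/(nk)], uses the exact factor (n + 2)/(n + 1) for k = 2, and uses the limiting factor (n − 2)/(n − 4) for k → ∞ (which requires n > 4). The function names are illustrative, and the intermediate columns are not reproduced because they depend on the sampled values of f(n, k).

```python
def meier_factor(n, k):
    """w times Meier's approximate variance when all w_i and all n_i are equal."""
    return 1.0 + 2.0 * (k - 1) / (n * k)


def ratio_k2(n):
    """Meier's variance over the exact variance for k = 2."""
    return meier_factor(n, 2) / ((n + 2.0) / (n + 1.0))


def ratio_limit(n):
    """Meier's variance over the limiting variance as k grows without bound (n > 4)."""
    meier_limit = 1.0 + 2.0 / n          # limit of meier_factor(n, k) for large k
    true_limit = (n - 2.0) / (n - 4.0)   # f(n, infinity)
    return meier_limit / true_limit


if __name__ == "__main__":
    for n in (6, 8, 10, 12, 15, 20):
        print(n, round(ratio_k2(n), 3), round(ratio_limit(n), 3))
```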


From inspection of table 4, Meier’s formula appears to underestimate 
the true variance, the relative underestimation increasing as k increases. 
If we are willing to regard a 6 per cent underestimation of the variance as 
tolerable, table 5, derived from table 4, shows the smallest values of n 
for which Meier’s formula is satisfactory in this sense. 

When at most 5 estimates are being combined, the sampling in- 
vestigation suggests that Meier’s approximation does remarkably well, 
being satisfactory for values of n as low as 4 or 6. 

TABLE 5
Smallest values of n for which Meier’s formula underestimates
by less than 6 per cent.


Number of estimates, k        2     3     4     5     6     8   ≥10

Smallest no. of d.f., n       4     4     6     6     8    10    12


The increase in the underestimation by Meier’s formula when k
becomes large can be attributed to the effect of terms in $1/n^2$ and
higher orders. As $k \to \infty$, Meier’s formula gives

$$V(\bar{x}_{\hat{w}}) = \frac{1}{w}\left(1 + \frac{2}{n}\right).$$

On the other hand, the correct limiting variance, by formula (1), may be
written

$$\frac{1}{w}\left(1 + \frac{2}{n} + \frac{8}{n^2} + \frac{32}{n^3} + \cdots\right).$$
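
The series form of the limiting variance follows from a geometric expansion of (n − 2)/(n − 4). A short derivation, assuming n > 4 so that the series converges:

```latex
\frac{n-2}{(n-4)\,w}
  = \frac{1}{w}\left(1 + \frac{2}{n-4}\right)
  = \frac{1}{w}\left(1 + \frac{2}{n}\cdot\frac{1}{1 - 4/n}\right)
  = \frac{1}{w}\left(1 + \frac{2}{n} + \frac{8}{n^{2}} + \frac{32}{n^{3}} + \cdots\right).
```

Meier’s limiting value retains only the first two terms, so the leading part of the shortfall is $8/(n^2 w)$.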


6. COMPARISON WITH MEIER’S FORMULA FOR THE ESTIMATED
VARIANCE OF $\bar{x}_{\hat{w}}$

The sampling data were also used to investigate the performance of
Meier’s formula (5) for the estimated variance of $\bar{x}_{\hat{w}}$. The procedure
was as follows. The formula reads

$$v(\bar{x}_{\hat{w}}) = \frac{1}{\hat{w}}\left[1 + \frac{4}{\hat{w}^2} \sum_{i=1}^{k} \frac{\hat{w}_i(\hat{w} - \hat{w}_i)}{n_i}\right]. \tag{5}$$

For any specified n and k, a large number of sets of k independent values
of $\hat{w}_i$, each derived from n degrees of freedom, had already been
assembled for the determination of f(n, k) as described in section 2. By
substitution in formula (5), each set provided one sample value of
$v(\bar{x}_{\hat{w}})$. The average $\bar{v}(\bar{x}_{\hat{w}})$ of this quantity, taken over all the sets, is an
estimate of the true mean given by Meier’s formula. The ratio of
$\bar{v}(\bar{x}_{\hat{w}})$ to the variance of $\bar{x}_{\hat{w}}$ as found from the same group of sets, i.e.
to f(n, k)/w, was then computed. The argument is that if Meier’s
formula is unbiased, these ratios should fluctuate about a value close
to 1. As before, the comparison is restricted to the case where the
$\sigma_i^2$ are all equal and the $n_i$ are all equal.

The ratios are shown in table 6. Calculations were made only for
n ≥ 6, since Meier’s formula, which neglects terms of order $1/n^2$, was
not expected to be valid for n < 6.


TABLE 6
Ratio of average value of Meier’s estimated variance to variance in
Table 1
k = number of estimates
n 2 3 4 5 6 8 10 12 15 20
6 96 93 83 83 =. 81 4G 75 72 70 68 
8 1.01 91 92 Sie. 82 81 81 78 76 
10 1.01 97 95 93 .92 89 89 88 87 86 
12 99 98 97 95 .94 94 92 92 89 90 
15 .97 .98 99 94 .93 .98 .93 oye TD) 
20 OIL OL ae 9S, 2997 98-96" F995 O78 99's. 96 


For k = 2, table 6 indicates that Meier’s formula does extremely well 
down to n = 6. For higher values of k, the formula appears to under- 
estimate to a greater degree than the corresponding formula for the true 
variance (table 4). If, as in section 5, we accept an underestimation by 
6 per cent or less, the smallest values of n for which the formula is 
satisfactory are shown below for the different values of k. 


Number of estimates, k        2     3     4     5     6     8   ≥10

Smallest no. of d.f., n       6    10    10    12    12    12    20


As with the formula for the true variance, the underestimation can be
attributed to the effects of neglected terms of higher order in 1/n. In
the limiting distribution when $k \to \infty$, the mean value of Meier’s formula
(5) can be shown to be

$$\frac{n - 2}{nw}\left(1 + \frac{4}{n}\right). \tag{9}$$


The first term, (n − 2)/nw, is the mean value of $1/\hat{w}$ in (5); the second
term is the mean value of the expression inside the square brackets in (5).

From equation (1), the correct limiting variance of $\bar{x}_{\hat{w}}$ is
(n − 2)/(n − 4)w. For comparison with (9), this may be written

$$\frac{n - 2}{nw}\left(1 + \frac{4}{n - 4}\right). \tag{10}$$
Inspection of (9) and (10) suggests that for large k, Meier’s formula
would be relatively free from bias if the terms in $1/n_i$ were changed to
terms in $1/(n_i - 4)$. For k = 2, on the other hand, the formula seems
excellent as it stands.

As an empirical attempt to improve the performance of the formula,
we considered replacing the quantities $n_i$ by quantities $n_i'$, where

$$n_i' = n_i - 4 + \frac{4}{k - 1} = n_i - \frac{4(k - 2)}{k - 1}. \tag{11}$$

This substitution leaves the formula unchanged when k = 2, but gives
it the correct mean value in the limiting distribution as $k \to \infty$.

From the sampling data, the average value of the adjusted formula 
was worked out for each n and k, in exactly the same way as for the 
original formula. The ratios of these average values to the true variance 
as estimated from the sampling data are shown in table 7. Thus, 
table 7 presents the same data for the adjusted formula as did table 6 
for the original formula. 


TABLE 7 


Ratio of average value of the adjusted Meier’s formula 
for the estimated variance to the variance in Table 1. 


k = number of estimates 
n 2 3 4 5 6 8 10 12 15 20 
6 "96. 1206) 017042 sLe102 12S 1a 4s Sa ee ee er SS 
8 LEO] 98.012045 01,01. 1-032 1, 0020 0L 101 .99 .98 
10 LOL R02 Se OSee 0d a O2k ee OF 02 sala Ole eOOMIE OT 
12 JOO-eel Ole eOs. 1eOlanin Olas 120251 00st Ol .99 1.00 
15 29%. 1200521 02% a 98) 6 98) ne309) 21h 00 07 ae OG ae OT, 
20 Pe Ohesd02" -1.00" W014 10 29901 035 0s ie 03 ade 0G 


The adjusted formula appears very satisfactory down to n = 8. 
For n = 6, the adjusted formula works tolerably well for k < 4, but for 
larger values of k it gives too high a variance. 


7. NUMERICAL EXAMPLE 


The application of the adjusted formula will be illustrated by the
example presented by Meier. The data, from a paper by Snedecor (5),
give the percentage of albumin in the plasma protein of normal human
subjects, as obtained in 4 different experiments. The relevant figures
appear in table 8.
TABLE 8


Illustration of the adjusted formula


                                Column

   (1)            (2)             (3)          (4)                  (5)

 $s_i^2$   $\hat{w}_i = 1/s_i^2$   $n_i$   $\hat{w} - \hat{w}_i$   $n_i' = n_i - 2.667$
 1.0822         0.9241            11         2.9869               8.333
 0.5227         1.9133            14         1.9977              11.333
 4.7761         0.2094             6         3.7016               3.333
 1.1571         0.8643            15         3.0467              12.333

           $\hat{w} = 3.9110$


Columns (1)-(3) contain the basic data. Column (4) is formed
from column (2). For the $n_i'$, we have from (11)

$$n_i' = n_i - 4 + \frac{4}{3} = n_i - \frac{8}{3} = n_i - 2.667.$$


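These values appear in column (5). Finally, from the adjusted
form of equation (5),

$$v(\bar{x}_{\hat{w}}) = \frac{1}{\hat{w}}\left[1 + \frac{4}{\hat{w}^2} \sum_{i=1}^{4} \frac{\hat{w}_i(\hat{w} - \hat{w}_i)}{n_i'}\right]$$

$$= \frac{1}{3.9110}\left[1 + \frac{4}{(3.9110)^2}\left\{\frac{(0.9241)(2.9869)}{8.333} + \cdots + \frac{(0.8643)(3.0467)}{12.333}\right\}\right]$$

$$= 0.2557\,[1 + \cdots] = 0.3302.$$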


This is about 6 per cent higher than Meier’s value of 0.3111 as found by
the original formula (5). For the approximate number of degrees of
freedom to be ascribed to this variance, Meier has suggested

$$\left[\sum_{i=1}^{k} \frac{1}{n_i}\left(\frac{\hat{w}_i}{\hat{w}}\right)^2\right]^{-1}.$$

This comes out to 38.6 for these data.
Before the publication of Meier’s formula, we had constructed an
empirical formula for the estimated variance, based on the results of
the sampling investigation as follows:

$$v(\bar{x}_{\hat{w}}) = \frac{1}{\hat{w}} \cdot \frac{\bar{n}\,[k(\bar{n} - 2) + 8]}{(\bar{n} - 2)\,[k(\bar{n} - 4) + 12]},$$

where $\bar{n}$ is the average number of degrees of freedom in the k estimates.
This formula was obtained by fitting a simple algebraic function to the
values of f(n, k) which we found. It is subject to the same restriction as
the sampling studies, in that it assumes f(n, k) to be independent of the
values of the $\sigma_i^2$, whereas Meier’s formula has a sounder theoretical
basis.

Since Bliss (6) has used this formula (with acknowledgement) in
one of his publications, it may be well to remark that down to $n_i = 6$
the formula agrees well enough with the adjusted Meier formula in the
cases in which we have checked it, being slightly more conservative.
In the present example we have $\bar{n} = 11.5$, k = 4, and the formula
gives 0.339 for the estimated variance.
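
The arithmetic of this example can be retraced in a few lines. The sketch below assumes formula (5) in the form (1/ŵ)[1 + (4/ŵ²) Σ ŵ_i(ŵ − ŵ_i)/n_i], the adjustment n_i' = n_i − 4(k − 2)/(k − 1), and the empirical formula with the average degrees of freedom. The degrees-of-freedom expression is a Satterthwaite-type form adopted here as an assumption; it happens to reproduce the quoted 38.6. All function names are illustrative.

```python
def meier_estimated_variance(weights, dfs):
    """Formula (5): estimated variance of the weighted mean."""
    w_hat = sum(weights)
    correction = sum(wi * (w_hat - wi) / ni for wi, ni in zip(weights, dfs))
    return (1.0 + 4.0 * correction / w_hat ** 2) / w_hat


def adjusted_dfs(dfs):
    """Adjusted degrees of freedom n_i' = n_i - 4(k - 2)/(k - 1); needs k >= 2."""
    k = len(dfs)
    return [ni - 4.0 * (k - 2) / (k - 1) for ni in dfs]


def empirical_variance(weights, dfs):
    """Empirical formula based on the average degrees of freedom."""
    k = len(dfs)
    n_bar = sum(dfs) / k
    factor = n_bar * (k * (n_bar - 2) + 8) / ((n_bar - 2) * (k * (n_bar - 4) + 12))
    return factor / sum(weights)


def approx_degrees_of_freedom(weights, dfs):
    """Assumed form: reciprocal of sum over i of (w_i / w)^2 / n_i."""
    w_hat = sum(weights)
    return 1.0 / sum((wi / w_hat) ** 2 / ni for wi, ni in zip(weights, dfs))


# Columns (2) and (3) of table 8
weights = [0.9241, 1.9133, 0.2094, 0.8643]
dfs = [11, 14, 6, 15]

v_original = meier_estimated_variance(weights, dfs)
v_adjusted = meier_estimated_variance(weights, adjusted_dfs(dfs))
v_empirical = empirical_variance(weights, dfs)
dof = approx_degrees_of_freedom(weights, dfs)

if __name__ == "__main__":
    print(round(v_original, 4), round(v_adjusted, 4), round(v_empirical, 3), round(dof, 1))
```

Because the weights are entered to four decimals, the results agree with the quoted figures only to about that precision.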


SUMMARY 


We are given k independent estimates $x_i$ (i = 1, 2, ..., k) of the same
mean $\mu$. The estimates are thought to be of unequal precision, and for
the i-th estimate we have an unbiased estimate $s_i^2$ of its variance $\sigma_i^2$,
based on $n_i$ degrees of freedom. This paper describes the results of a
sampling investigation undertaken some years ago in order to study the
variance of the weighted mean

$$\bar{x}_{\hat{w}} = \sum \hat{w}_i x_i \Big/ \sum \hat{w}_i, \quad \text{where } \hat{w}_i = 1/s_i^2.$$


The variances were obtained for values of k between 2 and 20, and for
values of $n_i$ (assumed all equal) between 2 and 20. The variances found
from the sampling investigation were expressed in the form f(n, k)/w,
where $w = \sum 1/\sigma_i^2$. Since there is reason to believe that the factor
f(n, k) is relatively independent of the $\sigma_i^2$, the sampling computations
were made for the case in which all $\sigma_i^2$ are equal.

Since f(n, k) = 1 when the correct weights $1/\sigma_i^2$ are used, the factor
f(n, k) gives a measure of the extent to which the variance of $\bar{x}_{\hat{w}}$ is
inflated owing to sampling errors in the weights $\hat{w}_i$. For given n,
f(n, k) increases steadily as k increases, so that the inflation of variance
is smallest when only a few estimates are being combined.

The results for the variance of $\bar{x}_{\hat{w}}$ enable its precision to be compared
with that of the unweighted mean $\bar{x}$. When k = 2, taking $\sigma_2^2$ as the
larger variance, $\bar{x}$ is preferable if $\sigma_2^2/\sigma_1^2 < 2$, while $\bar{x}_{\hat{w}}$ is preferable, for
any value of n down to 4, if $\sigma_2^2/\sigma_1^2 > 3$. If this ratio lies between 2 and 3,
$\bar{x}_{\hat{w}}$ appears preferable if the weights are based on at least 12 degrees of
freedom each.

The results provide a partial check on approximate formulas recently
developed by Meier for the variance and the estimated variance of $\bar{x}_{\hat{w}}$.
In these formulas, terms in $1/n_i^2$ are ignored. The comparisons suggest
that if 5 or fewer estimates are being combined, Meier’s formula for the
true variance is satisfactory for values of n down to 6. It is satisfactory
for any number of estimates if n is at least 12.

Meier’s formula for the estimated variance

$$\frac{1}{\hat{w}}\left\{1 + \frac{4}{\hat{w}^2} \sum \frac{1}{n_i}\,\hat{w}_i(\hat{w} - \hat{w}_i)\right\}$$

appears adequate down to about n = 12, although it tends to be an
underestimate. An empirical adjustment, which makes its performance
adequate down to n = 8, is to replace $n_i$ in the formula by

$$n_i' = n_i - \frac{4(k - 2)}{(k - 1)}.$$


THE RANDOM WALK OF
TRICHOSTRONGYLUS RETORTAEFORMIS


S. R. BROADBENT AND DAVID G. KENDALL


Magdalen College, 
Oxford 


This paper examines the behaviour of certain larvae in terms of a 
random walk of the “Brownian motion” type, and places on record 
the solution to a problem suggested by their characteristics. 

The larvae of the helminth Trichostrongylus retortaeformis are 
hatched from eggs in the excreta of sheep or rabbits, and wander ap- 
parently at random until they climb and remain on blades of grass 
where they are eaten by another animal, in whose intestines the cycle 
recommences. The question considered here is: what is the distribution 
of the larvae thus trapped on blades of grass? 


1. THE RANDOM WALK 


On the assumption that the x and y coordinates of a larva, measured
on a plane with origin at the point of release, are independent Gaussian
variables with mean zero and variance $\sigma^2$, their joint distribution is

$$\frac{1}{2\pi\sigma^2} \exp\left[-\frac{x^2 + y^2}{2\sigma^2}\right] dx\,dy \qquad (-\infty < x, y < \infty).$$

Transforming to polar coordinates and integrating with regard to $\theta$,
we find the marginal density at radius r to be

$$\frac{r}{\sigma^2} \exp\left[-\frac{r^2}{2\sigma^2}\right] dr \qquad (0 < r < \infty),$$

and thus we get the expected proportion contained in a circle of radius
r (the radial cumulative distribution) to be

$$P_r = 1 - \exp\left[-\frac{r^2}{2\sigma^2}\right].$$


If now we assume that this distribution results from a random walk
of the “Brownian motion” type we have the relation

$$\sigma^2 = \omega t,$$

where $\omega$ is a diffusion constant. (It is the rate of increase of the variance,
$\sigma^2$.) Then at time t the density at radius r will be

$$\frac{r}{\omega t} \exp\left[-\frac{r^2}{2\omega t}\right] dr \qquad (0 < r < \infty), \tag{1}$$

and the expected proportion in a circle of radius r will be

$$P_r(t) = 1 - \exp\left[-\frac{r^2}{2\omega t}\right]. \tag{2}$$
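
The expected proportion inside a circle can be compared with a direct simulation of the walk. This is a minimal sketch under stated assumptions: each coordinate receives independent Gaussian increments of variance ω·δt over a fixed number of steps, and the values of ω, t and r in the usage lines are arbitrary illustrations rather than fitted quantities. The function names are hypothetical.

```python
import math
import random


def fraction_within(radius, omega, t, steps=10, walkers=20000, seed=7):
    """Simulated share of walkers lying within `radius` of the origin at time t."""
    rng = random.Random(seed)
    step_sd = math.sqrt(omega * t / steps)  # s.d. of one increment per coordinate
    inside = 0
    for _ in range(walkers):
        x = y = 0.0
        for _ in range(steps):
            x += rng.gauss(0.0, step_sd)
            y += rng.gauss(0.0, step_sd)
        if math.hypot(x, y) <= radius:
            inside += 1
    return inside / walkers


def expected_fraction(radius, omega, t):
    """Radial cumulative distribution 1 - exp(-r^2 / (2 * omega * t))."""
    return 1.0 - math.exp(-radius ** 2 / (2.0 * omega * t))


if __name__ == "__main__":
    print(fraction_within(3.0, 0.54, 10.0), expected_fraction(3.0, 0.54, 10.0))
```

The number of steps does not affect the distribution at time t, since the sum of independent Gaussian increments has variance ωt whatever the subdivision.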


2. TRAPPING 


2.1. We now introduce as a first hypothesis the assumption that a
larva, while performing the random walk, may in any short interval
$\delta t$ with probability $\lambda\,\delta t$ come upon a blade of grass and climb up it.
Further, having gone up this channel, it is unable to turn round and
must stay there.

We write $\Pi(t)$ for the probability that a larva is trapped on a blade
of grass in the time interval (0, t); if $\pi(t)\,dt$ is the distribution of
time-to-trapping (the period of free motion) then

$$\Pi(t) = \int_0^t \pi(\tau)\,d\tau \qquad (0 < t < \infty).$$

The probability that a larva is trapped in the interval $(0, t + \delta t)$
is the sum of $\Pi(t)$ and the joint probability that the larva is free at
time t and trapped in the interval $(t, t + \delta t)$:

$$\Pi(t + \delta t) = \Pi(t) + [1 - \Pi(t)]\,\lambda\,\delta t,$$

$$\pi(t) = \lambda\left[1 - \int_0^t \pi(\tau)\,d\tau\right].$$

The solution of this equation is

$$\pi(t)\,dt = \lambda e^{-\lambda t}\,dt \qquad (0 < t < \infty). \tag{3}$$

The final distribution of the trapped larvae will be given by
multiplying together the expressions (1) and (3) and integrating the result
with regard to t. We obtain

$$K_0(\rho)\,\rho\,d\rho \qquad (0 \le \rho < \infty), \tag{4}$$

where $\rho = r\sqrt{2\lambda/\omega}$, and $K_0(\rho)$ is a standard Bessel function tabulated,
for example, by Watson (1).
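
The integration over t that leads to (4) can be written out in a few lines. The sketch relies on the standard integral $\int_0^\infty t^{-1} e^{-a/t - \lambda t}\,dt = 2K_0(2\sqrt{a\lambda})$ for a, λ > 0, which is quoted here as an assumption and applied with $a = r^2/2\omega$:

```latex
\left[\int_0^\infty \frac{r}{\omega t}\, e^{-r^{2}/(2\omega t)}\; \lambda e^{-\lambda t}\, dt\right] dr
  = \frac{2\lambda}{\omega}\, K_0\!\left(r\sqrt{2\lambda/\omega}\right) r\, dr
  = K_0(\rho)\, \rho\, d\rho ,
\qquad \rho = r\sqrt{2\lambda/\omega},
```

where the last step uses $r\,dr = (\omega/2\lambda)\,\rho\,d\rho$.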


Hence 
Using the differential equation,

$$\rho K_1'(\rho) + K_1(\rho) + \rho K_0(\rho) = 0,$$

we can integrate the $\rho$-distribution to get the radial cumulative
distribution,

$$F(\rho) = \mathrm{prob}\left[r\sqrt{2\lambda/\omega} > \rho\right] = \rho K_1(\rho). \tag{5}$$

A short table of values of $F(\rho)$ is given below.


TABLE I


$\rho$       0.00   0.34   0.56   0.78   1.01   1.26   1.55   1.91   2.41   3.21

$F(\rho)$    1.0    0.9    0.8    0.7    0.6    0.5    0.4    0.3    0.2    0.1
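
The tabulated values can be checked without special-function libraries. The sketch below evaluates K_1 from the integral representation K_1(x) = ∫_0^∞ exp(−x cosh t) cosh t dt by Simpson's rule; the truncation point and step count are assumptions chosen to be ample for the range of ρ in Table I, and would need enlarging for much smaller ρ. The names `bessel_k1` and `tail_fraction` are illustrative.

```python
import math


def bessel_k1(x, upper=10.0, steps=4000):
    """K_1(x) for x > 0 by Simpson's rule on the integral of exp(-x cosh t) cosh t."""
    h = upper / steps  # steps must be even for Simpson's rule

    def integrand(t):
        return math.exp(-x * math.cosh(t)) * math.cosh(t)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def tail_fraction(rho):
    """F(rho) = rho * K_1(rho), with the limiting value 1 at rho = 0."""
    if rho == 0.0:
        return 1.0
    return rho * bessel_k1(rho)


if __name__ == "__main__":
    for rho in (0.00, 0.34, 0.56, 0.78, 1.01, 1.26, 1.55, 1.91, 2.41, 3.21):
        print(rho, round(tail_fraction(rho), 3))
```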


2.2. As a second hypothesis we assume that a larva is not necessarily
trapped by the first blade of grass visited, but that it has a probability
p of being trapped when it visits a blade of grass, and that now $\mu\,\delta t$
is the probability that a free larva will visit a blade of grass in a short
interval $\delta t$.

If we define for this case $\Pi_1(t)$ and $\pi_1(t)\,dt$ as in 2.1, the probability
that a larva is trapped in the interval $(0, t + \delta t)$ is now the sum of
$\Pi_1(t)$ and the joint probability that the larva is free at time t, visits a
blade of grass in the interval $(t, t + \delta t)$, and is trapped there. That is,

$$\Pi_1(t + \delta t) = \Pi_1(t) + [1 - \Pi_1(t)]\,\mu\,\delta t\,p.$$

The solution is now

$$\pi_1(t)\,dt = p\mu\,e^{-p\mu t}\,dt \qquad (0 < t < \infty).$$

The formulae (4) and (5) apply as before, with $\lambda = p\mu$.


3. ESTIMATION


3.1. If we are given a sample of r-values observed at a given time t
it is a simple matter to estimate the “one-dimensional variance-rate”
$\omega$, for this is equivalent to estimating the variance $\omega t$ of a circular
Gaussian population. We shall have

$$\sum r_i^2 = \omega t\,\chi^2_{2N}$$

(if the sample is of size N), so that

$$\hat{\omega} = \frac{\sum r_i^2}{2Nt} \tag{6}$$

will be an unbiased sufficient estimate of $\omega$ with sampling variance $\omega^2/N$.


3.2. In the data considered below, the numbers of larvae in
concentric annuli at certain times are given. To get the estimate (6) we
must evaluate $\sum r_i^2$ for a particular time, when the values of r may be
taken as independent. If there are $n_p$ larvae in the annulus bounded
by circles of radius p and p + 1 units of length ($\sum n_p = N$), and if we
assume that they are uniformly distributed over the annulus, we find

$$\sum_p n_p\left[(p + \tfrac{1}{2})^2 + \tfrac{1}{4}\right] \tag{7}$$

as an estimate of $\sum r_i^2$.

For in the annulus (p, p + 1) the equivalent uniform density in
numbers of larvae per unit area is given by

$$D_p = n_p/[\pi(2p + 1)],$$

and $\sum r_i^2$ over this annulus is therefore

$$\int_p^{p+1} u^2 \cdot 2\pi u\,D_p\,du = n_p\left[(p + \tfrac{1}{2})^2 + \tfrac{1}{4}\right].$$

Equation (7) follows by summation over p.

There is therefore a correction of amount +N/4 to be made to the
value of $\sum r_i^2$ obtained by assuming the larvae in each annulus to be
located midway between the bounding circles.


4. DATA


4.1. In Table II we give the data kindly made available to us by 
Dr. H. D. Crofton. In one experiment Crofton observed the wanderings 
of a total of 400 larvae; only a small number were placed at one time 
on the horizontal microscope slide. The larvae were released at the 
centre of the field of view, and at successive intervals of 5 seconds (up 
to a total of 30 seconds) the numbers of larvae in concentric annuli of 
radii 1, 2, --: , 12 units of 0.7 mm. were counted. 


4.2. If the data satisfy relation (2), log [1 − $P_r(t)$] plotted against
$r^2/t$ should give a straight line passing through the origin, where $P_r(t)$
is the observed proportion of the number of larvae which lie within a
circle of radius r units at time t. The slope of this line is inversely
proportional to the parameter $\omega$. The graph of these points is given in
Figure I; a straight line through the origin is a reasonable first
approximation.
FIGURE I (Crofton’s data). log [1 − $P_r(t)$] (Y) against $r^2/t$ (X): see §4.2.
Key: the figure 1 represents t = 5 seconds, 2 represents t = 10 seconds, etc.

4.3. The estimates of $\omega$ for the six successive times (which are not
independent) using (7) are


t =    5    10    15    20    25    30 seconds


The arithmetic mean of these estimates is 0.54. The standard error of
each estimate at each time is about 0.027.


TABLE II
Wandering of larvae of Trichostrongylus retortaeformis
in a horizontal plane
(Crofton’s data)


             Number of larvae in annulus with outer radius
Time in
seconds     1    2    3    4    5    6    7    8    9   10   11   12
    5      53  124  118   81   21    3
   10      32   60  142   88   41   20   11    6
15 20" 31. 169) .124--7 45 31 : 2 ‘i
   20      11   40   65   94   96   47   19    6   10    7    5
25 its 33) 4547 47287. 69 487. 20 8 8 ? 1
30 iS pecee tS ae 20s 90S eek «17 3 7


Notes: (i) the unit of length is 0.7 mm.
(ii) the first five row-totals are each 400; the last is 399.
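
Formulas (6) and (7) are easy to apply to a row of annulus counts. The sketch below uses the rows of Table II for t = 5, 10 and 20 seconds, each of which sums to 400; the function names are illustrative, and the results are simply what the two formulas give for those counts, in squared units of 0.7 mm per second.

```python
def sum_r_squared(counts):
    """Estimate (7): counts[p] larvae lie between radius p and p + 1."""
    return sum(n_p * ((p + 0.5) ** 2 + 0.25) for p, n_p in enumerate(counts))


def omega_hat(counts, t):
    """Estimate (6): sum of r^2 divided by 2 N t."""
    total = sum(counts)
    return sum_r_squared(counts) / (2.0 * total * t)


# Rows of Table II, outer radius 1, 2, 3, ... (unit 0.7 mm)
rows = {
    5: [53, 124, 118, 81, 21, 3],
    10: [32, 60, 142, 88, 41, 20, 11, 6],
    20: [11, 40, 65, 94, 96, 47, 19, 6, 10, 7, 5],
}

if __name__ == "__main__":
    for t, counts in rows.items():
        estimate = omega_hat(counts, t)
        # approximate standard error omega / sqrt(N), from the variance omega^2 / N
        print(t, round(estimate, 3), round(estimate / sum(counts) ** 0.5, 3))
```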


4.4. It will be seen from Figure I that there is some evidence of
systematic deviation from a straight line. The deviation becomes
more prominent when we compare the entries in Table II with the
values given by (2), using the above estimates of $\omega$. It is only at t = 5
seconds that the value of $\chi^2$ for this comparison is within the 5% significance
level; at all other times it is well beyond the 1% significance level.

It also seems probable that the values of w decrease with increasing 
time. Even though the six estimates are not independent, their range 
is about 7.1 times their estimated standard error. 

For these two reasons it is doubtful whether the model is adequate 
to describe the data fully. It may, however, be of some interest as an 
approximation. 

The authors wish to thank Dr. H. D. Crofton for permission to
publish his observations, Dr. F. C. Frank who first suggested that it
might be worthwhile examining the problem from the present standpoint,
and Dr. D. J. Finney for some helpful comments on the method of
analysis.


SUMMARY 


The distribution of larvae which are trapped on blades of grass
while performing a random walk of the “Brownian motion” type is
derived. The adequacy of such a random walk as a model for the
wandering of larvae of Trichostrongylus retortaeformis is discussed in
relation to empirical data supplied by Dr. H. D. Crofton.


REFERENCE
(1) Watson, G. N., Theory of Bessel Functions (Cambridge, 1922).


THE ANGULAR TRANSFORMATION IN QUANTAL ANALYSIS


P. J. CLARINGBOLD, J. D. BIGGERS, AND C. W. EMMENS


Department of Veterinary Physiology, 
University of Sydney, N.S.W., Australia 


SUMMARY 


The angular transformation may be used in two ways in the analysis 
of quantal data. Transformation of the observed response (Eisenhart, 
1947) leads to a quick noniterative but approximate solution. If the 
expected response is transformed, an exact iterative maximum likelihood 
solution is available. Comparisons have been made which indicate the 
practical similarity of the two methods, though where additional 
accuracy is required one cycle of the maximum likelihood solution 
following the method of Eisenhart seems all that is required. 

To overcome difficulties with regions of 0 and 100% response in 
factorial experiments, parallelogram designs have been introduced. 


1. INTRODUCTION 


Two main approaches are available for the analysis of binomial 
(or quantal) data. The first is by a multiple regression technique 
based on the equivalent deviate transformation of Finney (1949), 
which originated in discussions of the pioneer studies of Gaddum (1933), 
Hemmingsen (1933) and Bliss (1934, 1935a, b), and is widely used in 
bioassay work. The second, leading to an analysis of variance, relies 
directly on appropriate transformation of the data. Where the data 
cover a wide range of proportions the angular (or inverse sine) trans- 
formation is relevant. Although the latter method has been applied 
in agricultural and operational research (Bliss 1937, 1938; Cochran 
1938, 1940; Eisenhart, 1947), and is described in several textbooks 
(Snedecor, 1946; Johnson, 1949; Brownlee, 1949 and Mather, 1949), 
it has not been widely used. The purpose of this paper is to discuss 
the application of the angular transformation in the design and analysis 
of multifactor experiments where the response is in quantal form. 
Examples will be taken, for illustrative purposes, from investigations 
into the response of the vaginal epithelium of ovariectomized mice to
oestrogens.


2. THE ANGULAR TRANSFORMATION


Since the angular transformation was introduced by Fisher in 1922 
its development has taken two different directions; first as a special 
case of the equivalent deviate transformation, and secondly for the 
equalization of variance where the variance is dependent on the mean. 

Consider a large binomial population in which a proportion P
possesses a given attribute. Let $p_i$ (i = 1, ..., m) be the proportions
of this attribute in successive samples of size n;


the expectation of p = E(p) = P,
and the variance of p = V(p) = PQ/n, where Q = 1 − P.


2.1. In the equivalent deviate transformation. 


Finney (1949, 1952c) has discussed the equivalent deviate
transformation. For any specified function f(v), a quantity Y, the equivalent
deviate of P, may be defined as a monotonic increasing function of P
by the equation

$$P = \int_{-\infty}^{Y} f(v)\,dv.$$

A well-known substitution in this equation gives the probit
transformation. By another substitution the angular transformation may be
derived, and is given by

$$P = \sin^2 Y, \qquad 0 \le Y \le 90^\circ. \tag{1}$$


The analysis following this transformation may be carried out in
one of two ways, (i) as a multiple regression, (ii) as an analysis of
variance, both methods being iterative. The former method is
computed in an analogous manner to the probit plane technique described
by Finney (1952a) with the added advantage from the computational
point of view that weights are equal provided n is constant. The
latter method has been described by Cochran (1940) and follows the
usual analysis of variance procedures on the working angle values for
each group. The working angles may be computed using the table of
working angle and range given by Fisher and Yates (1948)
following estimation of provisional values by graphical or other means.

If n is not constant, weighted analyses (weight proportional to n)
are carried out, but owing to the loss of orthogonality the advantages
of the analysis of variance are to a large extent lost.


2.2. In the equalization of variance. 


Curtiss (1948) derived the general theorems for the simplification of
variance of random variables where the variance is dependent on the
mean. Both Curtiss and Eisenhart (1947) have applied the general
theorem to the binomial case and have shown that the transformation
defined by

$$\varphi = \varphi(p) = 2 \arcsin \sqrt{p}, \tag{2}$$

where 0 < p < 1, and 0 < $\varphi$ < $\pi$ (radian measure), fulfils the
requirements. The form of the transformation suggested by Yates to
Bartlett (1936), and by Fisher to Bliss (1937) was

$$p = \sin^2 Y, \quad \text{where } 0 < p < 1,\ 0 < Y < 90^\circ \text{ (degree measure)}, \tag{3}$$

with variance

$$V(Y) = \frac{820.7}{n} + O\left(\frac{1}{n^2}\right). \tag{4}$$

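
The constant 820.7 in (4) can be recovered by a first-order (delta-method) argument. The sketch below assumes Y = arcsin √p measured in radians and then converts to degrees; it is offered as a plausibility check rather than as the derivation used by the authors cited.

```latex
\frac{dY}{dp} = \frac{1}{2\sqrt{p(1-p)}}, \qquad
V(Y) \approx \left(\frac{dY}{dp}\right)^{2}_{p=P} \frac{PQ}{n}
     = \frac{1}{4n}\ \text{(radians)}^{2}
     = \left(\frac{180}{\pi}\right)^{2} \frac{1}{4n}
     \approx \frac{820.7}{n}\ \text{(degrees)}^{2}.
```

The leading term does not involve P, which is the sense in which the transformation equalizes the variance.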
When n is small it is found that in the extreme ranges of proportions
the variance of Y is still largely dependent on p. Bartlett (1936, 1937)
introduced an empirical correction factor when sample proportions
are 0 and 1, and these have since been derived graphically by Eisenhart
(1947). The transformation incorporating Bartlett’s adjustment
becomes

$$\begin{aligned}
Y_B(p) &= \arcsin \sqrt{p}, \quad \text{where } 0 < p < 1,\ 0 < Y < 90^\circ, \\
Y_B(0) &= \arcsin \sqrt{1/4n}, \\
Y_B(1) &= 90 - Y_B(0).
\end{aligned} \tag{5}$$


From all practical points of view the transformation defined by (5) 
fulfills the requirement that the variance is independent of the mean, 
provided 0.05 < P < 0.95, and n > 10 and is constant for all samples. 
Eisenhart (1947) has calculated the variance at P = 0.5 for small 
samples (see Table 1). 
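
Transformation (5) is simple to compute directly. The sketch below works in degrees and takes the number responding and the group size as arguments; the function names are illustrative. The large-sample variance 820.7/n is included for comparison, on the understanding that for small n the tabulated variances are somewhat larger.

```python
import math


def bartlett_angle(successes, n):
    """Angle in degrees for the proportion successes/n, with Bartlett's adjustment."""
    if successes == 0:
        return math.degrees(math.asin(math.sqrt(1.0 / (4.0 * n))))
    if successes == n:
        return 90.0 - bartlett_angle(0, n)
    return math.degrees(math.asin(math.sqrt(successes / n)))


def large_sample_variance(n):
    """Theoretical variance of the angle, in squared degrees."""
    return 820.7 / n


if __name__ == "__main__":
    for n in (10, 20, 50):
        print(n, round(bartlett_angle(0, n), 2), round(bartlett_angle(n, n), 2),
              round(large_sample_variance(n), 1))
```

With 20 animals per group, for example, a response of 0 out of 20 is scored as about 6.42 degrees rather than 0, and 20 out of 20 as about 83.58 degrees rather than 90.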

An independent study of the binomial variable has been made by 
Ghurye (1949) who arrived at transformations for the equalization of 
variance practically equivalent to the above. 

Following these transformations a one stage analysis of variance is 
carried out. 


2.3. Theoretical comparison of the two approaches. 


The binomial variate poses two problems which must be overcome 
before it may be used in small sample estimation problems, (1) dis- 
continuity, (2) information depending on the mean. The discontinuity 
may be overcome by replacing the discontinuous observations by a 
continuous distribution of working responses. These are based on the 
TABLE 1
Table of Bartlett’s correction and variance of $Y_B$ at P = 0.5 for sample size 10-50
inclusive. The table is in degrees to facilitate use of the correction with the
transformation as tabulated by Bliss (1937b) and has been prepared from the table
by Eisenhart (1947).


Sample size  $Y_B(0)$  $Y_B(1)$  $V_{0.5}(Y_B)$  Sample size  $Y_B(0)$  $Y_B(1)$  $V_{0.5}(Y_B)$

10 9.10 80.90 92.4 30 5.24 84.76 28.3 
11 8.67 81.33 83.0 31 5.15 84.85 27.4 
12 8.30 81.70 75.3 32 5.07 84.93 26.5 
13 7.97 82.03 68.9 33 4.99 85.01 25.7 
14 7.68 82.32 63.6 34 4.92 85.08 24.9 
15 7.42 82.58 59.0 35 4.85 85.15 24.1 
16 7.18 82.82 55.0 36 4.78 85.22 23.5 
17 6.96 83.04 51.5 37 4.72 85.28 22.8
18 6.77 83.23 48.5 38 4.65 85.35 22.2
19 6.59 83.41 45.8 39 4.59 85.41 21.6 
20 6.42 83.58 43.3 40 4.54 85.46 21.1 
21 6.27 83.73 41.1 41 4.48 85.52 20.5 
22 6.12 83.88 39.2 42 4.43 85.57 20.0 
23 5.98 84.02 37.4 43 4.37 85.63 19.5 
24 5.86 84.14 35.8 44 4.32 85.68 19.1 
25 5.74 84.26 34.2 45 4.27 85.73 18.6 
26 5.63 84.37 32.9 46 4.23 85.77 18.2 
27 5.52 84.48 31.6 47 4.18 85.82 17.8 
28 5.42 84.58 30.5 48 4.14 85.86 17.5 
29 5.33 84.67 29.4 49 4.10 85.90 aoa 
50 4.05 85.95 16.7 


For values n > 50 theoretical variance is given by $V(Y_B) = 820.7/n$. If Bartlett’s
correction is required,

$$Y_B(0) = \arcsin \sqrt{1/4n}, \qquad Y_B(1) = 90 - Y_B(0).$$


expected responses derived from consideration of the collective data
of the experiment (Fisher, 1935a). By a simple non-linear change of
the scale of measurement the information may be rendered constant
and dependent only on the size of the group, and the angular
transformation was introduced for this purpose. The combined use of the
angular transformation and the notion of working responses gives an
exact maximum likelihood solution to the problem, with which
exact fiducial inference is possible.
An approximate solution is provided by the change of scale only
(with arbitrary corrections) and ignoring the effect of discontinuity 
(2.2.). In this case homoscedasticity has been imposed on the data 
only for large samples where the effect of discontinuity is unimportant. 
For small samples, however, the information is still dependent on the 
mean. Under these conditions exact fiducial inference is not possible 
(Fisher, 1935b). Since the approximate method is so easy to compute 
it is of importance to ascertain its efficiency, and in section 3 this 
question has been answered by a practical comparison of the two 
methods. 


2.4. Comparison with other transformations. 


Several transformations have been advocated for the analysis of 
quantal data. Although the probit has been recommended as most 
suitable in bioassay (e.g. Bartlett, 1947) practical comparisons between 
the probit, angular, logistic and rectangular transformations by efficient 
methods have detected no important differences between them (Finney, 
1947, 1952c; Biggers, 1951). Both Finney and Berkson (1949) have 
discussed the iterative solutions with reference to the logit transforma- 
tion and recently Dyke and Patterson (1952) have introduced a basically 
similar method for factorial experiments, also using the logit trans- 
formation. Finney (1952c) has discussed in more detail comparisons 
between these transformations and comes to the conclusion that all 
but the rectangular are very nearly the same between 0.02 < P < 0.98. 
Thus, when the data cover a wide range of proportions no one trans- 
formation will usually give a significantly better fit than another, and 
the choice will therefore rest on practical expediency. 

Other reasons, usually put forward particularly with reference to
the probit transformation, are concerned with its supposedly greater
correspondence with reality. This seems of no consequence when the
object of a test or bioassay is demonstrably served quite as efficiently
by the use of much simpler mathematical models. The infinity of
equally valid models has been discussed by many previous authors
(cf. Emmens, 1948).


3. PRACTICAL EVALUATION OF THE METHODS 


3.1. A simple 4² experiment to illustrate methods.

During studies on the effect of metabolic inhibitors on the vaginal 
response to oestrogens it was necessary to ascertain whether the response 
was purely additive or whether interactions occurred. Table 2 shows 
the design and the results of the experiment. 
TABLE 2
The effect of locally administered monoiodoacetate and oestrone on the percentage
response of ovariectomized mice.


Oestrone                     Monoiodoacetate (µg.)
(10⁻⁴ µg.)     12.5₍₋₃₎*    25.0₍₋₁₎    50.0₍₁₎    100.0₍₃₎
 2₍₋₃₎            25           15          15          0
 4₍₋₁₎            25           15          20          5
 8₍₁₎             45           35          15         15
16₍₃₎             70           50          35         20


Number of animals per group = 20


*Logarithmic coding shown as a subscript in parentheses.


3.11. Analysis of variance of working responses. 


The observed percentage responses were transformed by Bliss’ 
(1937) table and plotted over the abscissa log dose oestrone plus log 
dose monoiodoacetate. A regression plane was fitted by eye and the pro- 
visional values obtained. The method is similar to the probit plane tech- 
nique of Finney (1952a). These provisional values were corrected by the 
corresponding maximum working angle and range tabulated by Fisher 
and Yates (1948) to produce the working angles on which a standard 
analysis of variance was carried out (Table 3). The theoretical variance 


TABLE 3


Analysis of variance (3.11) of transformed data of Table 2.


Source of variation      D.f.   Sum of       Mean      F       P
                                squares      square
Oestrone                 (3)    (1030.3)
  Linear                  1       982.1                23.9    <0.001
  Quadratic               1        47.2                 1.2    >0.05
  Cubic                   1         1.0                 0.0    >0.05
Monoiodoacetate          (3)    (1123.0)
  Linear                  1      1087.1                26.5    <0.001
  Quadratic               1        11.4                 0.3    >0.05
  Cubic                   1        24.5                 0.6    >0.05
Interactions              9       208.4      23.2       0.6    >0.05
Theoretical variance      ∞                  41.0
is given by 820.7 ÷ 20. A test of significance was first carried out
between the interaction mean square with 9 degrees of freedom and 
the theoretical variance with infinite degrees of freedom. Since no 
significant difference was found the mean squares of individual items 
of Table 3 were tested against the theoretical variance. 

Throughout this paper the variance ratio test will be used, although 
in the case of a theoretical variance with infinite degrees of freedom 
the test degenerates to a χ². If the error variance is significantly
different from the theoretical variance the former must be used to 
examine the main effects. In these circumstances the F test is necessary, 
and in order to avoid the use of both tests in the analysis, the more 
general one has been employed. 

Orthogonal polynomials were used to separate the oestrone and 
monoiodoacetate effects respectively into their linear, quadratic and 
cubic components. In each case, by applying the methods described 
by Bliss and Marks (1939)*, the linear components were used to estimate 
the regression coefficients and their variances. From these regression 
coefficients, and the general mean response, the following equation was 
obtained 


$$Y = 7.06X_1 - 7.41X_2 + 29.92,$$
where Y is the angle of response,
$X_1$ is the logarithm of the dose of oestrone,
$X_2$ is the logarithm of the dose of monoiodoacetate,
$b_1$ is the regression coefficient of $X_1$,
$b_2$ is the regression coefficient of $X_2$.


The iterative procedure was continued until the difference between 
successive regression coefficients was less than 1/10th of the standard 
errors of the final coefficients respectively (Fisher and Yates, 1948). 
The final equation was 


$$Y = 7.01X_1 - 7.38X_2 + 28.98,$$


and this was used to estimate the expected number of positive reactors
in each group in order to calculate χ² for goodness of fit by the
“long-hand” method (see Emmens, 1948). Grouping of extremely low or
high expected responses was carried out as described by Finney (1952a),
dropping a corresponding number of degrees of freedom. Of the remaining
degrees of freedom three are used in the estimation of parameters.
Thus the χ² value has 7 degrees of freedom and is 3.64 (0.9 > P > 0.8).


*In the original publication of Bliss and Marks the equation for the variance of the regression
coefficient for an even number of groups is given by mistake as

$$V_b = \frac{2V_e}{I^2\,S(n_y k_y^2)} \quad\text{instead of}\quad \frac{4V_e}{I^2\,S(n_y k_y^2)},$$

where $V_b$ is the variance of the regression coefficient,
$V_e$ is the variance associated with random sampling (in this case the theoretical variance),
$I$ is the logarithmic interval,
$n_y$ is the number per observation (in this case unity),
$k_y$ is the orthogonal coefficient.


TABLE 4
Analysis of variance (3.12) of transformed data of Table 2.


Source of variation      D.f.   Sum of       Mean      F       P
                                squares      square
Oestrone                 (3)    (1038.3)
  Linear                  1       978.6                22.6    <0.001
  Quadratic               1        57.8                 1.3    >0.05
  Cubic                   1         1.9                 0.0    >0.05
Monoiodoacetate          (3)    (1063.1)
  Linear                  1      1044.0                24.1    <0.001
  Quadratic               1         3.8                 0.1    >0.05
  Cubic                   1        15.3                 0.4    >0.05
Interaction               9       181.3      20.2       0.5    >0.05
Theoretical variance      ∞                  43.3


3.12. Analysis of variance of empirical responses. 


After the observed percentage responses were converted to the 
angular values by means of Bliss’ (1937) table, and the value of Bartlett’s 
correction for the zero response inserted from Table 1, a standard 


analysis of variance was carried out (Table 4). The theoretical variance 
for n = 20 was obtained from Table 1. 


The following equation was obtained from the analysis


$$Y = 7.00X_1 - 7.2X_2 + 28.94,$$


and a goodness of fit test applied ($\chi^2_{(7)} = 4.60$, 0.8 > P > 0.7).
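The computation of method 3.12 for the data of Table 2 can be retraced as follows. This is a sketch under stated assumptions: angles are computed directly from the arcsine formula rather than read from Bliss' table (so sums of squares differ slightly from Table 4), the zero response receives Bartlett's correction, the theoretical variance 43.3 is the value quoted in Table 4, and slopes are expressed per doubling of dose. All names are illustrative.

```python
import math

# Percentage responses of Table 2: rows are oestrone levels (2, 4, 8, 16),
# columns are monoiodoacetate levels (12.5, 25, 50, 100).
PERCENT = [
    [25, 15, 15, 0],
    [25, 15, 20, 5],
    [45, 35, 15, 15],
    [70, 50, 35, 20],
]
N = 20                          # animals per group
THEORETICAL_VARIANCE = 43.3     # quoted in Table 4 for n = 20
LINEAR = [-3, -1, 1, 3]         # orthogonal linear coefficients for four levels

def angle(percent, n):
    """Angular transformation in degrees, with Bartlett's correction at 0 and 100%."""
    p = percent / 100.0
    if p == 0.0:
        p = 1.0 / (4.0 * n)
    elif p == 1.0:
        p = 1.0 - 1.0 / (4.0 * n)
    return math.degrees(math.asin(math.sqrt(p)))

angles = [[angle(p, N) for p in row] for row in PERCENT]
row_totals = [sum(row) for row in angles]          # one total per oestrone level
col_totals = [sum(col) for col in zip(*angles)]    # one total per monoiodoacetate level
grand_mean = sum(row_totals) / 16.0

def linear_component(totals, observations_per_total=4):
    """Return (sum of squares, slope per doubling of dose, variance of the slope)."""
    contrast = sum(k * t for k, t in zip(LINEAR, totals))
    divisor = observations_per_total * sum(k * k for k in LINEAR)
    # The coded levels move two units per doubling, hence the factors 2 and 4.
    return (contrast ** 2 / divisor,
            2.0 * contrast / divisor,
            4.0 * THEORETICAL_VARIANCE / divisor)

ss_oestrone, b_oestrone, v_oestrone = linear_component(row_totals)
ss_mia, b_mia, v_mia = linear_component(col_totals)
print(round(grand_mean, 2), round(b_oestrone, 2), round(b_mia, 2))
print(round(ss_oestrone, 1), round(ss_mia, 1), round(math.sqrt(v_oestrone), 2))
```

The linear sums of squares come out within about two units of the 978.6 and 1044.0 of Table 4, and the slopes and general mean agree with the fitted equation to the accuracy shown.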


3.2. A comparison of estimated regression coefficients. 


A series of factorial experiments have been analysed using both 
methods for the purposes of comparison. The method 3.12 was applied 
first and used to provide provisional values for analysis in two cycles 
by method 3.11. Regression coefficients were calculated at each stage
and are shown in Table 5. The vertical differences between each of
the eighteen sets of regressions are of no consequence as they do not
represent estimates of the same parameter.


TABLE 5 
Comparative values of regression coefficients and χ² for goodness of fit obtained in
six factorial experiments analysed by the method of Eisenhart (1947) followed by 
two cycles of the Fisher-Bliss method. 


Exper- Eisenhart Fisher-Bliss 
iment Se So 
No. b 33 First | Second Sp 
cycle cycle 
b b 
1. 1.6 | 2.4 1.8 1-6 62.3) 
97. ice. 12.6 12.6 | 2.0} 
—6.9 2.4 Xi16) =) iis —6.4 —6.4 2.0 xi161 = 14.0 
—11.4 | 2.1/0.5 > P > 0.3} —11.57—11.5 | 2.0)0.5 > P > 0.3 
2. 2312.1 24 Sef Oe Pas 
135733;2.1 13:34) +13, 39/22. 0 
10.4 | 2.4[xf17) = 7.25 10.1 10.1 | 2.3{x717, = 8.76 
—9.1 | 2.1)0.98>P>0.95 | —8.5| -—8.6 | 2.0)0.95>P>0.90 
3 —0.3 | 2.4 —0.6| —0.6 | 2.3 
—14.6 | 2.10.7 > P >0.5| —14.9 | —15.0 | 2.0/0.9 >P > 9.8 
4 7.0 | 1.5\x?7, = 4.60 reA 7.0 | 1.4\x?7) = 3.64 
—7.2|1.5/0.8>P>0.7| —7.4| —7.4|1.4J/0.9>P>0.8 
5. 17.7 | 3.3\ x21; = 9.20 17.6 | 17.6 | 3.2\x?1o,=°7-81 
9.8| 1.5/0.7 >P>0.5 9.7 0.7 \elek On? Sur Osh 
6. 12.7 | 1.4 12 6x 126 bo 
Ey 6s i es We cer ei Oe Li 2.6 2.7 | M1 dic; = 7.69 
—2.710.9)0.7>P>0.5| —-2.7| -2.8/0.9)0.7>P>0.5 
eee eee ney hi 2 eae ee 
Xizs; = 59.4 Xi7s) = 54.0 
P =0.77 P = 0.90 


Difference between methods: $\chi^2_{(1)} = 5.4$, 0.05 > P > 0.02


Making horizontal comparisons between the three values of each
regression few differences are seen and while the difference between
method 3.12 and 3.11 is noticeable, although negligible for practical
purposes, there is no sensible difference between the two cycles of
method 3.11. This illustrates the statement of Fisher (1925) that “in
approaching the maximum likelihood solution by successive
approximations ... starting with an inefficient statistic, a single process of
approximation will in ordinary cases give an efficient statistic differing
from the maximum likelihood solution, by a quantity which with
increasing samples decreases as $n^{-1}$.” A second cycle results in an
efficient statistic differing from the maximum likelihood solution by a
quantity, in increasing samples, of order $n^{-3/2}$.

The main difficulty with the maximum likelihood solution where 
the number of variables is three or more lies in the determination of 
provisional estimates by graphical means. If the graphical estimates 
are far from the truth many iterations may be necessary. The above 
results indicate that the method 3.12 gives a very good basis for the 
maximum likelihood process and in many practical situations may be 
sufficiently accurate in itself,—the decision, however, will rest with the 
investigator. 


4. 0 AND 100% RESPONSES


The problem of 0 and 100% responses has been widely discussed. 
Transformations whose limits are defined in the interval (−∞, ∞) are
supposed to overcome this difficulty since 0 and 100% response groups 
can be considered to be within the distributions, and are assigned 
expectations. In practice, however, where several of these occur on 
the regression lines, bad fits are found whichever methods are used. 

Where a distribution with a finite range is used as a mathematical 
model two kinds of extreme response may be recognized, one arising 
as a random fluctuation in a sample with an expectation 0 < P < 1 
and one which will always occur outside the finite range. How may we 
distinguish these in practice? Past experience with the experimental 
material, or preliminary pilot experiments, will indicate the mean and 
range of definition and, provided the treatment combinations given 
are in this range any extreme responses may for all practical purposes 
be assumed to be within the finite range of the distribution, can be 
given expectations, and used additively.

In factorial designs, where all combinations of factor levels are
employed, large regions of 0 and 100% responses, which contribute
nothing to the knowledge of effects, are sometimes inevitable. Whether
expectations are given to these values, e.g. with the probit transformation,
or corrections made as in the case of Bartlett’s adjustment with
the angular transformation, regions of them will give bad fits. Clearly
the solution of the problem is one of experimental design based upon
the evidence of small comprehensive pilot experiments.
4.1. Parallelogram designs.


[FIGURE 1. An n × m factorial design, with an expected region of 0%, a region of useful observation, and an expected region of 100%.]


Consider an n × m factorial design (Fig. 1). Let the treatment
values corresponding to n be $X_{1i}$ (i = 1 ... n) and those corresponding
to m be $X_{2j}$ (j = 1 ... m). If preliminary investigations indicate that


the areas shown by circles are regions of extreme responses we may
choose to investigate the levels falling between them in the region of
useful observation. Thus while all levels of $X_1$ are used, only certain
levels of $X_2$ are employed at each of the levels of $X_1$. If the restricted
levels of $X_2$ are displaced in equal steps as we pass along the values of
$X_1$, the design takes the form of a parallelogram, and the functional
relationship between the values of $X_1$ and $X_2$ is linear. These designs
may be analysed by either the analysis of variance or the regression
plane technique.

5. ANALYSIS OF VARIANCE IN PARALLELOGRAM DESIGNS

The values of $X_{1i}$ and $X_{2j}$ are fixed by the experimental design;
corresponding to each value of $X_1$ are sets of values of $X_2$, each set
being of equal size but of different values. These numbers collectively
form a skew array, and for application of the analysis of variance must
be transformed into a rectangular array.

For treatment values situated on a line parallel to the sloping side
of the parallelogram in the figure, $X_1$ and $X_2$ are connected by a relation

$$C_1X_1 + C_2X_2 = \text{constant} = X_2^* \text{, say.}$$

Let us suppose that the intervals between consecutive values on this
line are $h_1$ and $h_2$ respectively for $X_1$ and $X_2$.
If $X_2^*$ can be chosen so that $X_2^* = X_2$ when $X_1 = 0$,
then $C_2 = 1$ and $C_1h_1 - C_2h_2 = 0$.
Thus we find $C_1 = h_2/h_1$, and

$$X_2^* = X_2 + \frac{h_2}{h_1}X_1. \qquad (6)$$

For simplicity the values of $X_{1i}$ and $X_{2j}$, if equally spaced, may be
coded orthogonally.

By means of the analysis of variance the constants of the equation

$$Y = \bar{Y} + b_1X_1 + b_2X_2^* + \cdots \qquad (7)$$

may be investigated.

The true regression of Y on $X_1$ is obtained by the substitution of
(6) in (7), rearrangement and simplification, whence (7) becomes

$$Y = \bar{Y} + b_1^*X_1 + b_2X_2 + \cdots, \quad\text{where } b_1^* = b_1 + C_1b_2 .$$

Since the regressions of Y on $X_1$ and $X_2^*$ have been estimated
independently the variance of $b_1^*$ is given by

$$Vb_1^* = Vb_1 + C_1^2\,Vb_2 .$$

Since $b_1^*$ and $Vb_1^*$ are computed from $b_1$ and $b_2$ it is necessary that
the system of polynomial coefficients used to code $X_1$, $X_2$ and $X_2^*$
should have identical linear scale intervals, i.e. the same λ for linear
regression (see Fisher and Yates, 1948).


5.1. A 4² example.


In this experiment the effect of the time between two injections of 
oestrone on the vaginal response has been investigated. Preliminary 
experiments suggested that a log-linear relationship existed between 
the time interval ($X_1$) and dose ($X_2$). The doses of oestrone were


chosen in the region of useful observation. The plan and results of the 
experiment are shown in Table 6. 


The transformation

$$X_2^* = X_2 + 2X_1, \quad\text{where } X_2^* = -3, -1, 1, 3 \quad (\lambda = 2),$$

is applicable in this case. After a full analysis of variance (3.12) was
made of the transformed data (Table 7), regression coefficients and their
variances were calculated, viz.

$$Y = 42.90 - 1.87X_1 + 9.81(X_2 + 2X_1),$$

$$Y = 17.74X_1 + 9.81X_2 + 42.90,$$

$$Vb_1^* = Vb_1 + 4Vb_2 = 5Vb_1 = 10.84, \quad\text{since } Vb_1 = Vb_2 .$$

TABLE 6
The effect of the interval of time ($X_1$) between divided doses of oestrone ($X_2$) on the
percentage vaginal response in ovariectomized mice.


Oestrone            Time interval between injections (hours)
(10-! µg.)       0.89₍₋₃₎    2.67₍₋₁₎    8.0₍₁₎    24.0₍₃₎
   1₍₋₉₎                                               20
   2₍₋₇₎                                               45
   4₍₋₅₎                                    15         50
   8₍₋₃₎                                    30         70
  16₍₋₁₎                        30          35
  32₍₁₎                         55          65
  64₍₃₎             25          60
 128₍₅₎             30          70
 256₍₇₎             60
 512₍₉₎             85

A χ² goodness of fit test showed that the model fitted the data (χ² =
9.20, 0.5 > P > 0.3).
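The route from the skew array of Table 6 to the regression on the original variables can be retraced numerically. The sketch below rests on several assumptions that are not stated in this form in the paper: each response is assigned to its time column by the requirement that $X_2 + 2X_1$ takes one of the values −3, −1, 1, 3; angles are computed from the arcsine formula rather than from Bliss' table; slopes are expressed per step between adjacent levels; and the theoretical variance 43.3 is taken from Table 7. Variable names are illustrative.

```python
import math

THEORETICAL_VARIANCE = 43.3   # value quoted in Table 7

# (coded time X1, coded dose X2, percentage response), read from Table 6.
OBSERVATIONS = [
    (3, -9, 20), (3, -7, 45), (1, -5, 15), (3, -5, 50),
    (1, -3, 30), (3, -3, 70), (-1, -1, 30), (1, -1, 35),
    (-1, 1, 55), (1, 1, 65), (-3, 3, 25), (-1, 3, 60),
    (-3, 5, 30), (-1, 5, 70), (-3, 7, 60), (-3, 9, 85),
]

def angle(percent):
    """Angular transformation in degrees (no 0 or 100% responses occur here)."""
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))

# Transform the skew array into a rectangular one indexed by (X1, X2*).
cells = {(x1, x2 + 2 * x1): angle(pct) for x1, x2, pct in OBSERVATIONS}
codes = [-3, -1, 1, 3]
assert sorted(cells) == [(a, b) for a in codes for b in codes]

def linear_slope(axis):
    """Slope per step between levels, and its variance, for one coded factor."""
    contrast = sum(key[axis] * y for key, y in cells.items())
    divisor = sum(key[axis] ** 2 for key in cells)
    return 2.0 * contrast / divisor, 4.0 * THEORETICAL_VARIANCE / divisor

grand_mean = sum(cells.values()) / len(cells)
b1, v1 = linear_slope(0)          # regression on X1 at fixed X2*
b2, v2 = linear_slope(1)          # regression on X2*
b1_star = b1 + 2.0 * b2           # regression on X1 at fixed X2, with C1 = 2
v_b1_star = v1 + 4.0 * v2
print(round(grand_mean, 2), round(b1, 2), round(b2, 2),
      round(b1_star, 2), round(v_b1_star, 2))
```

Because $b_1$ and $b_2$ are estimated from orthogonal contrasts of the rectangular array, their variances add in $Vb_1^*$ without a covariance term.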

For the purposes of comparison method 3.11 was also applied to 
the data (Table 8). The following equation was obtained after two 
cycles 

$$Y = 17.58X_1 + 9.70X_2 + 42.92,$$

($\chi^2_{(10)} = 7.81$, 0.7 > P > 0.5).


TABLE 7


Analysis of variance of the data of Table 6 following the transformation $X_2^* =
(X_2 + 2X_1)$, where $X_2$ is the dose of oestrone and $X_1$ is the time interval (method 3.12).


Source of variation      D.f.   Sum of       Mean      F       P
                                squares      square
Oestrone                 (3)    (1949.0)
  Linear                  1      1922.7                46.9    <0.001
  Quadratic               1         3.8                 0.9    >0.05
  Cubic                   1        22.5                 0.5    >0.05
Time interval             3       260.3      86.8       2.0    >0.05
Interaction               9       202.5      22.5       0.5    >0.05
Theoretical variance      ∞                  43.3


TABLE 8


Analysis of variance of the data of Table 6 following transformation $X_2^* = X_2 + 2X_1$,
where $X_2$ is the dose of oestrone and $X_1$ is the time interval (method 3.11).


Source of variation      D.f.   Sum of       Mean      F       P
                                squares      square
Doses                    (3)    (1910.3)
  Linear                  1      1883.7                44.7    <0.001
  Quadratic               1         3.1                 0.1    >0.05
  Cubic                   1        23.5                 0.6    >0.05
Time interval             3       254.5      84.8       2.1    >0.05
Interactions              9       191.0      21.2       0.5    >0.05
Theoretical variance      ∞                  41.0


5.2. A 3³ example.


The effect of the time interval ($X_1$), the number of injections in
this interval ($X_3$) and the dose of oestradiol-3:17β ($X_2$) was investigated
with each factor at three levels. The design and results of the
experiment are shown in Table 9.


TABLE 9 


The effect of time interval ($X_1$), frequency of injection ($X_3$) and dose of
oestradiol-3:17β ($X_2$) on the percentage vaginal response of ovariectomized mice.


Time Frequency of Dose of oestradiol-3:178 (107 ug.) 
Interval injection in | |_—A__), —____ 
(hours) time interval y4 63 (0.415) “ 5. 250.585) 10. 50.1. 585) 
16(-1) 2-1) 50.0 66.7 75.0 
400) 83.3 100.0 100.0 
8a) 58.3 66.7 83.3 
1 75-3) 3.50.0) 7.001) 
24,0) 2 50.0 75.0 75.0 
4 66.7 66.7 100.0 
8 41.7 75.0 83.3 


36.1) 2 16.7 41.7 58.3 
4 33.3 91.7 83.3 
8 41.7 50.0 83.3 


Number of animals per group = 12 


The coefficient ($X_{2p}$) as subscript is the logarithm to the base 2 of the dose given,
and is defined by

$$\text{Dose} = 3.50 \times 2^{X_{2p}},$$

e.g. $2.63 = 3.50 \times 2^{-0.415}$.
TABLE 10
Analysis of variance of the data of Table 9 following the transformation $X_2^* =
X_2 + 0.585X_1$, where $X_2$ is the dose of oestradiol-3:17β and $X_1$ is the time interval
(method 3.12).


Source of variation      D.f.   Sum of       Mean      F       P
                                squares      square
Oestradiol-3:17β         (2)    (2195.5)
  Linear                  1      2146.0                28.5    <0.001
  Quadratic               1        49.5                 0.7    >0.05
Time interval            (2)     (834.8)
  Linear                  1       796.5                10.6    0.01-0.001
  Quadratic               1        38.3                 0.5    >0.05
Frequency                (2)    (1414.3)
  Linear                  1       132.0                 1.8    >0.05
  Quadratic               1      1282.3                17.0    <0.001
Interactions             20      1025.3      51.3       0.7    >0.05
Theoretical variance      ∞                  75.3


By means of the transformation

$$X_2^* = X_2 + 0.585X_1, \quad\text{where } X_2^* = -1, 0, 1 \quad (\lambda = 1),$$

an analysis of variance (3.12) was made (Table 10) and regression
coefficients were calculated, viz.

$$Y - 56.34 = -6.65X_1 + 10.92(X_2 + 0.585X_1) - 4.87(3X_3^2 - 2),\dagger$$

or

$$Y = 66.08 - 0.27X_1 + 10.92X_2 - 14.62X_3^2. \qquad (8)$$

In order to test the significance of $b_1^*$ its variance was computed:

$$Vb_1^* = 5.62$$


It is obvious that $b_1^*$ is not significant, showing that the adoption
of this design was unnecessary. The design was based on evidence
from experiments with the related compound oestrone, and the difference
in behaviour of oestradiol-3:17β has since been confirmed (Biggers
and Claringbold, 1954).

The goodness of fit test for equation (8) showed that the model was
satisfactory ($\chi^2_{(18)} = 14.7$, 0.7 > P > 0.5).
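The arithmetic behind the test of $b_1^*$ can be sketched as follows. Two assumptions are made and are not taken verbatim from the paper: the theoretical variance is 75.3 (the value implied by the F ratios of Table 10), and the variance of each linear coefficient is that variance divided by the sum of squared codes over the 27 groups (eighteen, since nine groups lie at each of −1 and +1).

```python
import math

C1 = 0.585                     # coefficient of X1 in X2* = X2 + 0.585 X1
B1, B2 = -6.65, 10.92          # regressions on X1 (at fixed X2*) and on X2*
THEORETICAL_VARIANCE = 75.3    # assumed from the F ratios of Table 10
SUM_SQUARED_CODES = 18         # 27 groups coded -1, 0, 1, nine at each level

v_b = THEORETICAL_VARIANCE / SUM_SQUARED_CODES   # same for b1 and b2
b1_star = B1 + C1 * B2                           # regression on X1 at fixed X2
v_b1_star = v_b + C1 ** 2 * v_b
ratio = b1_star / math.sqrt(v_b1_star)           # compared with a normal deviate
print(round(b1_star, 2), round(v_b1_star, 2), round(ratio, 2))
```

The coefficient is a small fraction of its standard error, which is the sense in which it is described as not significant.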

Equation (8) was used to compute provisional estimates for an
exact iterative solution by method 3.11 (Table 11). After two cycles
the following equation was obtained,

$$Y = 66.36 - 0.60X_1 + 10.98X_2 - 15.02X_3^2 .$$


†The ξ′ functions (Fisher and Yates, 1948) have been employed for the calculation of the regression
coefficient of the quadratic term.


TABLE 11
Analysis of variance of the data of Table 9 following transformation $X_2^* = X_2 +
0.585X_1$, where $X_2$ is dose of oestradiol-3:17β and $X_1$ the time interval (method 3.11).


Source of variation      D.f.   Sum of       Mean      F       P
                                squares      square
Oestradiol-3:17β         (2)    (2199.1)
  Linear                  1      2171.4                31.8    <0.001
  Quadratic               1        27.7                 0.4    >0.05
Time interval            (2)     (926.1)
  Linear                  1       887.6                13.0    <0.001
  Quadratic               1        38.5                 0.6    >0.05
Frequency                (2)    (1479.4)
  Linear                  1       126.4                 1.8    >0.05
  Quadratic               1      1353.0                19.8    <0.001
Interaction              20      1024.0      51.2       0.8    >0.05
Theoretical variance      ∞                  68.4


6. CONCLUSIONS 


The comparisons which have been made in Sections 3 and 5 demon- 
strate that the two methods give similar answers. The advantages of 
the maximum likelihood solution lie in versatility (e.g. n need not be 
greater than 9 and need not be constant) and accuracy. The need for 
rapid methods of computation in routine and research laboratories has 
led to the development of many short-cut, but inefficient, approximate 
methods (for discussion see Berkson, 1950; Biggers, 1951; Finney,
1952b), none of which have been extended to analyse multifactor
designs. The results of the present paper show the ease and speed of
solution by the analysis of variance of the observed transformed response
provided the data are orthogonal. Furthermore, if the investigator
requires an exact solution this analysis forms an excellent starting point
for the iterative process.

The analysis of variance has enormous advantages over the regression
methods in multifactor designs. First, tests of significance of
treatment differences are all that are usually required and not the fitting
of constants. Secondly, when four or more regression coefficients are
to be fitted the iterative procedures other than with the angular
transformation are formidable (Finney, 1952a).

In the second part of the paper it has been shown that the problem 
of extreme responses may be overcome by the choice of suitable designs. 
This seems preferable to the selection of transformations defined in the 
infinite interval and all of which lead to unequal weight per transformed 
observation, thus making the computational procedures laborious. 
In this regard the angular transformation has considerable advantages 
over both the probit and logit transformations. 

The designs described may be modified to many types of region of 
useful observation. They will be determined by the relationship between 
this region and the independent variables. For example, if a severe 
quadratic effect is predicted a transformation of the form 


$$X_2^* = X_2 + C_1X_1 + C_2X_1^2$$


would be suitable. 

The use of these designs in the case of a continuous dependent 
variable can be imagined where the variance of an observation depends 
on the level of response. Also these designs may be used where certain 
regions are of no interest or can be neglected in the interests of economy 
of time or material. 

We wish to express our thanks to Sir Ronald Fisher, F.R.S., Dr. 
J. O. Irwin and Miss H. Newton Turner for helpful criticism of the 
manuscript. Also our thanks are due to Mr. T. Nalukowyj for his 


translation of the paper by Bliss (1937). 
THE TRUNCATED POISSON DISTRIBUTION


R. L. Plackett


University of Liverpool


1. The Poisson distribution is defined by

$$P_r = \lambda^r e^{-\lambda}/r! \qquad (r = 0, 1, 2, \ldots).$$


Examples have arisen in which this distribution is truncated because 
no observations are available (i) for r = 0, or (ii) for r greater than 
some specified value s. The truncated distribution (i), defined by 


$$P_r = \lambda^r e^{-\lambda}/\{r!\,(1 - e^{-\lambda})\} \qquad (r = 1, 2, 3, \ldots),$$


has been considered by David and Johnson in a recent issue of this 
journal [1]. In particular, they derive the maximum likelihood estimate 
of λ and its asymptotic variance, and discuss the efficiency of estimation
by moments. Distribution (ii) has been studied by Moore [2], who
provides a simple estimate of λ and shows the effect of truncation at
different values s. The purpose of the present note is to provide a
similar estimate of λ for distribution (i), to show that it is highly efficient,
and to estimate its sampling variance. 

2. Suppose that a sample of size N from a truncated Poisson distribution
of type (i) gives $N_r$ observations equal to r (r = 1, 2, 3, ...).
To estimate an arbitrary function θ(λ), use quantities of the form

$$\theta^* = \sum_{r=1}^{\infty} x_r N_r / N,$$

and evaluate the unknowns $x_r$ by the requirement that θ* is to be
unbiased. Thus

$$\sum_{r=1}^{\infty} x_r \lambda^r e^{-\lambda}/\{r!\,(1 - e^{-\lambda})\} = \theta(\lambda),$$

and $x_r$ is obtained as the coefficient of $\lambda^r/r!$ when $(e^{\lambda} - 1)\theta(\lambda)$ is
expanded in powers of λ.
3. First, consider the estimation of λ. If

$$\sum_{r=1}^{\infty} x_r \lambda^r/r! = \lambda(e^{\lambda} - 1),$$

a comparison of coefficients gives

$$x_1 = 0, \qquad x_r = r \quad (r \ge 2).$$

The desired estimate is therefore

$$\lambda^* = \sum_{r=2}^{\infty} r N_r / N.$$

Example. A factory employs about ten thousand workers but the
exact number fluctuates with the amount of work available. During
a certain period, the number of workers $N_r$ having r accidents is as
follows


If these data are regarded as a sample of 2390 from distribution (i), then

$$\lambda^* = 747/2390 = 0.3126 .$$
On the other hand, the maximum likelihood estimate, $\hat{\lambda}$, is the solution
of the equation

$$\sum_{r=1}^{\infty} r N_r / N = \hat{\lambda}/(1 - e^{-\hat{\lambda}}).$$

The left side is 1.1657 giving $\hat{\lambda}$ = 0.3149.
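The maximum likelihood equation has no closed-form solution, but it is easily solved numerically. The sketch below uses bisection, which is one convenient choice and not necessarily the method used in the note; the function name is illustrative.

```python
import math

def ml_truncated_poisson(sample_mean, tol=1e-10):
    """Solve lam / (1 - exp(-lam)) = sample_mean for lam > 0 by bisection."""
    if sample_mean <= 1.0:
        raise ValueError("the mean of a zero-truncated sample must exceed 1")
    lo, hi = 1e-12, sample_mean   # the left side exceeds lam, so the root is below the mean
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid / -math.expm1(-mid) < sample_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam_star = 747 / 2390                    # the simple estimate of the example
lam_hat = ml_truncated_poisson(1.1657)   # the maximum likelihood estimate
print(round(lam_star, 4), round(lam_hat, 4))
```

The function $\lambda/(1 - e^{-\lambda})$ increases steadily from 1, so the root is unique whenever the sample mean exceeds 1.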

4. The estimate λ* can be regarded as the mean value of a sample
of size N from a probability distribution with probability $P_1$ for the
value 0 (observed $N_1$ times), $P_2$ for the value 2 (observed $N_2$ times),
$P_3$ for the value 3 (observed $N_3$ times), ... If the variance of this
probability distribution is σ²,

$$\text{Var. } \lambda^* = \sigma^2/N$$

and, in fact,

$$\text{Var. } \lambda^* = [\lambda + \lambda^2/(e^{\lambda} - 1)]/N.$$

For the maximum likelihood estimate

$$\text{Var. } \hat{\lambda} \sim \lambda(1 - e^{-\lambda})^2/\{N(1 - e^{-\lambda} - \lambda e^{-\lambda})\}.$$

This is an asymptotic result, whereas the expression for Var. λ* is
exact. Some numerical values for the two variances are given in the
following table, which also permits an assessment of the loss of information
due to truncation (by comparing the second column with the third)


TABLE 1.
  λ      N Var. λ̂         N Var. λ*      Efficiency
         (asymptotic)      (exact)         of λ*
 0.5       0.8582           0.8854          0.9693
 1.0       1.5122           1.5820          0.9559
 1.5       2.0474           2.1462          0.9539
 2.0       2.5173           2.6261          0.9586
 2.5       2.9555           3.0589          0.9662
 3.0       3.3823           3.4716          0.9743
 3.5       3.8095           3.8814          0.9815
 4.0       4.2434           4.2985          0.9872


The efficiency of λ* is readily computed from the formula

$$\text{Var. } \hat{\lambda}/\text{Var. } \lambda^* = (1 - e^{-\lambda})\Big/\left\{1 - \left(\frac{\lambda}{e^{\lambda} - 1}\right)^2\right\};$$

it tends to 1 as λ tends to zero or infinity, and never falls below 0.9536,
the minimum value being attained when λ = 1.355. In view of the
ease with which λ* can be calculated, there are good reasons for using it.
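The two variance formulae and the efficiency can be evaluated directly. This is a small sketch with illustrative function names; both functions return N times the variance, as tabulated.

```python
import math

def n_var_ml(lam):
    """N times the asymptotic variance of the maximum likelihood estimate."""
    e = math.exp(-lam)
    return lam * (1.0 - e) ** 2 / (1.0 - e - lam * e)

def n_var_star(lam):
    """N times the exact variance of the simple estimate lambda*."""
    return lam + lam ** 2 / math.expm1(lam)

def efficiency(lam):
    """Efficiency of lambda* relative to maximum likelihood."""
    return n_var_ml(lam) / n_var_star(lam)

for lam in (0.5, 1.0, 1.355, 4.0):
    print(lam, round(n_var_ml(lam), 4), round(n_var_star(lam), 4),
          round(efficiency(lam), 4))
```

The ratio of the two functions agrees algebraically with the closed form $(1 - e^{-\lambda})/\{1 - [\lambda/(e^{\lambda} - 1)]^2\}$.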
5. Second, consider the estimation of Var. λ*. If

$$\sum_{r=1}^{\infty} y_r \lambda^r/r! = (e^{\lambda} - 1)\,N\,\text{Var. } \lambda^* = \lambda(e^{\lambda} - 1) + \lambda^2,$$

then $y_1 = 0$, $y_2 = 4$, $y_3 = 3$, ..., $y_r = r$ (r ≥ 3).
An unbiased estimate of Var. λ* is therefore

$$(\text{Var. } \lambda^*)^* = (N\lambda^* + 2N_2)/N^2.$$

In the example given above,

$$(\text{Var. } \lambda^*)^* = (747 + 624)/2390^2 = 0.0002400,$$

corresponding to a standard error of 0.0155. This estimate of the
variance of λ* may be compared with the value 0.0002422 obtained by
inserting λ = λ* in the exact formula. Proceeding as before,

$$\text{Var. }(\text{Var. } \lambda^*)^* = [\lambda + (7\lambda^2 - 2\lambda^3)/(e^{\lambda} - 1) - \lambda^4/(e^{\lambda} - 1)^2]/N^3,$$

whereas the variance of the quantity obtained by inserting λ* for λ
in the expression for Var. λ* is

$$\text{Var. }(\text{Var.}^* \lambda^*) \sim [1 + 2\lambda/(e^{\lambda} - 1) - \lambda^2 e^{\lambda}/(e^{\lambda} - 1)^2]^2\ \text{Var. } \lambda^*/N^2 .$$

Both variances are asymptotically equal to 8λ/N³ as λ tends to zero,
and to λ/N³ as λ tends to infinity. Some other values are given below,
and there appears to be some justification for using (Var. λ*)* provided
that λ is below 0.5 or above 4.0. Since the omission of the zero group
may reasonably arise from the small value of λ, a rapid estimate of
Var. λ* is worth mentioning.
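The two expressions for the variance of the variance estimates can be compared numerically. This is a sketch with illustrative names; both functions return N³ times the variance, and the plug-in formula is the first-order (delta method) approximation described in the text.

```python
import math

def n3_var_unbiased(lam):
    """N^3 times the exact variance of the unbiased estimate (Var. lambda*)*."""
    d = math.expm1(lam)
    return lam + (7.0 * lam ** 2 - 2.0 * lam ** 3) / d - lam ** 4 / d ** 2

def n3_var_plugin(lam):
    """N^3 times the asymptotic variance of Var. lambda* evaluated at lambda*."""
    d = math.expm1(lam)
    slope = 1.0 + 2.0 * lam / d - lam ** 2 * math.exp(lam) / d ** 2
    return slope ** 2 * (lam + lam ** 2 / d)

def unbiased_variance_estimate(n_total, sum_r_nr_from_2, n_2):
    """(N lambda* + 2 N_2) / N^2, where N lambda* is the sum of r N_r over r >= 2."""
    return (sum_r_nr_from_2 + 2.0 * n_2) / n_total ** 2

for lam in (0.5, 1.0, 2.0, 3.0):
    ratio = n3_var_plugin(lam) / n3_var_unbiased(lam)
    print(lam, round(n3_var_plugin(lam), 4), round(n3_var_unbiased(lam), 4),
          round(ratio, 4))
print(round(unbiased_variance_estimate(2390, 747, 312), 7))
```

The count $N_2$ = 312 used in the last line is inferred from the term 624 = 2$N_2$ in the worked example.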


TABLE 2.
  λ      N³ Var. (Var.* λ*)     N³ Var. (Var. λ*)*     Ratio
         (asymptotic)           (exact)

 0.5       2.1604                 2.6637                .8110
 1.0       2.4453                 3.5712                .6847
 1.5       2.2761                 3.6673                .6206
 2.0       2.1366                 3.4862                .6129
 2.5       2.1493                 3.3054                .6502
 3.0       2.3235                 3.2492                .7151
 3.5       2.6396                 3.3545                .7869
 4.0       3.0484                 3.6033                .8460


6. There is, in general, no reason to suppose that \ is the quantity 
to which special interest attaches, and the same method can be used 
to estimate any function of \ and its sampling variance, provided that 
these quantities can be expanded in power series. An investigation 
of the efficiency of the resulting estimate should always be made, 
however, for estimates as efficient as λ* may be the exception rather
than the rule. Thus, an unbiased estimate of $e^{-\lambda}$ is found to be

$$(e^{-\lambda})^* = [(N_1 + N_3 + N_5 + \cdots) - (N_2 + N_4 + N_6 + \cdots)]/N$$

but its efficiency, compared with $e^{-\hat{\lambda}}$, is as low as 0.4994 when λ = 0.5
and rapidly decreases to zero with increasing λ.
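The quoted efficiency can be checked with a few lines. The sketch assumes that the variance of $e^{-\hat{\lambda}}$ is obtained from the asymptotic variance of $\hat{\lambda}$ by the usual first-order approximation, and uses the fact that the unbiased estimate is a mean of values ±1 with expectation $e^{-\lambda}$; the counts in the usage line are hypothetical.

```python
import math

def unbiased_exp_estimate(counts):
    """Alternating-sign estimate of exp(-lambda); counts maps r to N_r (r >= 1)."""
    total = sum(counts.values())
    signed = sum(n if r % 2 == 1 else -n for r, n in counts.items())
    return signed / total

def efficiency_exp_estimate(lam):
    """Efficiency of the unbiased estimate relative to exp(-lambda_hat)."""
    e = math.exp(-lam)
    n_var_ml = lam * (1.0 - e) ** 2 / (1.0 - e - lam * e)   # N Var. of lambda_hat
    n_var_exp_ml = e ** 2 * n_var_ml                        # first-order approximation
    n_var_unbiased = 1.0 - e ** 2                           # values are +1 or -1
    return n_var_exp_ml / n_var_unbiased

print(unbiased_exp_estimate({1: 6, 2: 3, 3: 1}))   # hypothetical counts
print(round(efficiency_exp_estimate(0.5), 4))
```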
TABLES OF PEARSON-LEE-FISHER FUNCTIONS OF SINGLY
TRUNCATED NORMAL DISTRIBUTIONS*


A. C. Cohen, Jr. and John Woodward


The University of Georgia 


Singly truncated normal distributions are encountered in numerous 
scientific investigations and with such frequency that it seems highly 
desirable to provide tables or other computational aids for lessening the 
labor involved in computing estimates of population parameters. In a 
distribution of the type considered here, the point of truncation, $x_0'$,
is assumed known and measurements are possible only for $x' > x_0'$.
No record is available either of count or measurement for $x' < x_0'$.
Although Pearson-Lee-Fisher estimators [1, 2, 3] for such distributions 
were first given by Pearson and Lee in 1908, tables adequate for routine 
estimation without considerable computational effort have not previously 
been available. One of the present authors [4] derived equations in 
1949 which permit calculation of P.L.F. estimates by a simple iterative 
process with the aid of ordinary tables of normal curve areas and ordi- 
nates. Thereby dependence on special tables was eliminated, but the 
problem of effectively reducing the labor incident to routine analyses 
of large numbers of samples from these distributions remained. Using 
the equations mentioned above, the present tables were prepared in 
an effort to alleviate this latter difficulty. 
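The two tabulated functions can be reproduced with elementary library functions. The sketch below rests on assumptions not spelled out in this excerpt: Z(ξ) is taken to be the ratio of the normal ordinate at ξ to the area to its right, and the second function is taken to be $[1 - \xi(Z - \xi)]/[2(Z - \xi)^2]$. Both choices were made because they agree with the tabulated entries at ξ = −4.0 and −3.0, not because the text states them.

```python
import math

def z_function(xi):
    """Normal ordinate at xi divided by the area to the right of xi (assumed form of Z)."""
    ordinate = math.exp(-0.5 * xi * xi) / math.sqrt(2.0 * math.pi)
    upper_area = 0.5 * math.erfc(xi / math.sqrt(2.0))
    return ordinate / upper_area

def first_function(xi):
    """1 / (Z - xi), the first tabulated function."""
    return 1.0 / (z_function(xi) - xi)

def second_function(xi):
    """Assumed form of the second tabulated function."""
    d = z_function(xi) - xi
    return (1.0 - xi * d) / (2.0 * d * d)

for xi in (-4.0, -3.0, -2.5):
    print(xi, round(first_function(xi), 8), round(second_function(xi), 8))
```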

The frequency function of a distribution of the type under considera- 


tion may be written as 
fe 2 
Se ae Se OF 


peeves felis, i= 
Ee te Ja 1 Vane exp { 2 


where m and σ are respectively the population mean and standard
deviation, ξ is the truncation point in standard units of the complete


*Sponsored by the Office of Ordnance Research, U. S. Army, under contract DA-01-009-ORD-288. 




BIOMETRICS, DECEMBER 1953 


TABLE I: THE FUNCTIONS $1/(Z - \xi)$ AND $m_2/2m_1^2$

   ξ      1/(Z − ξ)     m₂/2m₁²    |     ξ      1/(Z − ξ)     m₂/2m₁²
 −4.0    0.2499 9164   0.5312 3118 |  −2.25    0.4381 8666   0.5889 6377
 −3.9     .2563 9720    .5328 4429 |  −2.24     .4399 7185    .5895 5609
 −3.8     .2631 3768    .5345 8231 |  −2.23     .4417 6893    .5901 5225
 −3.7     .2702 3924    .5364 5722 |  −2.22     .4435 7797    .5907 5225
 −3.6     .2777 3056    .5384 8215 |  −2.21     .4453 9905    .5913 5610
 −3.5    0.2856 4305   0.5406 7131 |  −2.20    0.4472 3224   0.5919 6380
 −3.4     .2940 1106    .5430 4005 |  −2.19     .4490 7762    .5925 7535
 −3.3     .3028 7213    .5456 0478 |  −2.18     .4509 3528    .5931 9077
 −3.2     .3122 6719    .5483 8291 |  −2.17     .4528 0528    .5938 1004
 −3.1     .3222 4074    .5513 9269 |  −2.16     .4546 8771    .5944 3318
 −3.0    0.3328 4097   0.5546 5301 |  −2.15    0.4565 8266   0.5950 6023
 −2.9     .3441 1993    .5581 8315 |  −2.14     .4584 9014    .5956 9105
 −2.8     .3561 3351    .5620 0245 |  −2.13     .4604 1030    .5963 2579
 −2.7     .3689 4145    .5661 2985 |  −2.12     .4623 4320    .5969 6440
 −2.6     .3826 0720    .5705 8349 |  −2.11     .4642 8890    .5976 0689
 −2.50   0.3971 9772   0.5753 8016 |  −2.10    0.4662 4750   0.5982 5324
 −2.49    .3987 1031    .5758 7929 |  −2.09     .4682 1906    .5989 0346
 −2.48    .4002 3291    .5763 8201 |  −2.08     .4702 0366    .5995 5755
 −2.47    .4017 6561    .5768 8833 |  −2.07     .4722 0139    .6002 1551
 −2.46    .4033 0847    .5773 9828 |  −2.06     .4742 1231    .6008 7733
 −2.45   0.4048 6157   0.5779 1186 |  −2.05    0.4762 3651   0.6015 4302
 −2.44    .4064 2497    .5784 2909 |  −2.04     .4782 7405    .6022 1257
 −2.43    .4079 9876    .5789 4998 |  −2.03     .4803 2503    .6028 8598
 −2.42    .4095 8209    .5794 7454 |  −2.02     .4823 3952    .6035 6324
 −2.41    .4111 7776    .5800 0278 |  −2.01     .4844 6759    .6042 4435
 −2.40   0.4127 8313   0.5805 3471 |  −2.00    0.4865 5932   0.6049 2930
 −2.39    .4143 9917    .5810 7035 |  −1.99     .4886 6479    .6056 1810
 −2.38    .4160 2597    .5816 0970 |  −1.98     .4907 8407    .6063 1073
 −2.37    .4176 6359    .5821 5978 |  −1.97     .4929 1724    .6070 0719
 −2.36    .4193 1211    .5826 9960 |  −1.96     .4950 6438    .6077 0746
 −2.35   0.4209 7160   0.5832 5017 |  −1.95    0.4972 2557   0.6084 1156

.4226 4214 | 5838 0449 ~1.94| 499 
ore .4994 0087 | .6091 1946 

; .4243 2380 | 5843 6257 ~1.93 | .5015 9020 
ES opie ooe : 0 .6098 3091 

.5849 2443 ~1.92 
Ben) oe ae .5037 9414 | .6105 1164 

.5854 9007 —1.91 | .5060 1226 | .6112 6591 
oe ets pee ee) ony —1.90 | 0.5082 4480 | 0.6119 8895 

: —1.89] .5104 9184 | 612 
nies a ae Batts yi 1.88 | 5127 5346 isis et 
~2. 61 —1.87 | .5150 2972 
—2.26 | .4364 1328 | 5883 7598 Sete 


.6149 1861 


-—-_|__= See | ag 
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1/(Z — 2) 


0.5196 
.5219 
.5242 
.5266 
.5289 


0.5313 


.5337 7 


.5361 
.5386 
.5410 


0.5435 
.5459 
5484 
.5509 
.5535 


0.5560 
5585 
.5611 
.5637 
.5663 


0.5689 
.5715 


.5769 
.5795 


0.5822 
.5850 
.5877 
.5904 
.5932 


0.5960 
.5988 
.6016 
. 6044 
.6073 


0.6102 
6130 
6159 
6189 
6218 


2649 
4714 
8275 
3337 
9908 


1046 
9292 


9194 
0757 
4333 
8893 
5479 


3753 
3719 
5385 
8757 
3840 


0640 


0. 
.6361 9784 
.6370 3834 
.6378 7240 
.6387 1436 


i=) 


9165 _ 


9418 
1407 
5138 


Mz/2m,2 


.6156 6035 
.6164 0578 
.6171 5491 
.6179 0771 
.6186 6418 


.6194 2429 
.6201 8803 
.6209 5540 
.6217 2636 
.6225 0090 


.6232 7901 
.6240 6067 
.6248 4581 
.6256 3457 
.6264 2677 


.6272 2244 
.6280 2156 
.6288 2411 
.6296 3008 
.6304 3944 


.6312 5218 
.6320 6823 
.6328 8763 
.6337 1031 
.6345 3628 


6353 6550 


.6395 5945 
.6404 0763 
.6412 5887 
.6421 1316 
.6429 7045 


.6438 3073 
.6446 9396 
.6455 6011 
.6464 2915 


6473 0106 


$\xi$ | $1/(Z - \xi)$ | $m_2/2m_1^2$
−1.45 | 0.6248 0613 | .6481 7579
−1.44 | .6277 7842 | .6490 5333
−1.43 | .6307 0521 | .6498 4875
−1.42 | .6337 7578 | .6508 1667
−1.41 | .6368 0090 | .6517 0233
−1.40 | 0.6398 4390 | .6525 9083
−1.39 | .6429 0912 | .6534 8791
−1.38 | .6459 8319 | .6543 7554
−1.37 | .6490 7966 | .6552 7177
−1.36 | .6521 9408 | .6561 7053
−1.35 | 0.6553 2650 | .6570 7178
−1.34 | .6584 7696 | .6579 7552
−1.33 | .6616 4553 | .6588 8168
−1.32 | .6648 3225 | .6597 8524
−1.31 | .6680 3716 | .6607 0116
−1.30 | 0.6712 6031 | .6616 1440
−1.29 | .6745 0175 | .6625 2993
−1.28 | .6777 6152 | .6634 4771
−1.27 | .6810 3899 | .6643 6681
−1.26 | .6843 3624 | .6652 8477
−1.25 | 0.6876 5128 | .6662 1419
−1.24 | .6909 8483 | .6671 4061
−1.23 | .6943 3692 | .6680 6397
−1.22 | .6977 0760 | .6689 9959
−1.21 | .7010 9692 | .6699 3208
−1.20 | 0.7045 0490 | .6708 6652
−1.19 | .7079 3159 | .6718 0286
−1.18 | .7113 7703 | .6727 4108
−1.17 | .7148 4125 | .6736 8113
−1.16 | .7183 2428 | .6746 2297
−1.15 | .7218 2618 | .6755 6656
−1.14 | .7253 4667 | .6765 1150
−1.13 | .7288 8666 | .6774 5885
−1.12 | .7324 4532 | .6784 0746
−1.11 | .7360 2297 | .6793 5765
−1.10 | 0.7396 1964 | .6803 0941
−1.09 | .7432 3536 | .6812 6267
−1.08 | .7468 7016 | .6822 1740
−1.07 | .7505 2406 | .6831 7356
−1.06 | .7541 9711 | .6841 3110
TABLE I: THE FUNCTIONS $1/(Z - \xi)$ AND $m_2/2m_1^2$ (Continued)

$\xi$ | $1/(Z - \xi)$ | $m_2/2m_1^2$ | $\xi$ | $1/(Z - \xi)$ | $m_2/2m_1^2$
—1.05 | 0.7578 8932 | 0.6850 9000 —0.65 | 0.9215 0402 | 0.7240 7363 
—1.04| .7616 0140 | .6860 5107 —0.64 | .9259 9527] .7250 5211 
—1.03 | .7653 31383 | .6870 1166 —0.63 | .9305 0610 | .7260 3022 
—1.02 | .7690 8119 | .6879 7434 —0.62 | .9350 3648 | .7270 0792 
—1.01 | .7728 5081 | .6889 3821 —0.61 | .9395 8641 | .7279 8517 
—1.00 | 0.7766 3873 | 0.6899 0322 —0.60 | 0.9441 5588 | 0.7289 6193 
—0.99 | .7804 4645 | .6908 6932 —0.59 | .9487 4489 | .7299 3817 
—0.98 | .7842 7351 | .6918 3648 —0.58 | .9533 5341 | .7309 1385 
—0.97 | .7881 1992 | .6928 0466 —0.57 | .9579 8144 | .7318 8892 
—0.96 | .7919 8570 | .6937 7381 —0.56 | .9626 2896 | .7328 6336 
—0.95 | 0.7958 6454 | 0.6947 3584 —0.55 | 0.9672 9606 | 0.7338 3725 
—0.94| .7997 7546 | .6957 1486 —0.54| .9719 8244 | .7348 1019 
—0.93 | .8036 9947 | .6966 8668 —0.53 | .9766 8837 | .7357 8250 
—0.92 | .8076 4293 | .6976 59380 —0.52| .9814 1373 | .7367 5403 
—0.91 | .8116 0585 | .6986 3269 —0.51 |} .9861 5852 | .7377 2474 
—0.90 | 0.8155 8824 | 0.6996 0680 —0.50 | 0.9909 2272 | 0.7386 9460 
—0.89 | .8195 9013 | .7005 8159 —0.49 | 0.9957 0531 | ~7396 6233 
—0.88 | .8236 1151 | .7015 5703 —0.48 | 1.0005 0925 | .7406 3160 
—0.87 | .8276 5241 | .7025 3306 —0.47 | 1.0053 3156 | .7415 9869 
—0.86 | .8317 1284] .7035 0964 —0.46 | 1.0101 7320 | .7425 6478 
—0.85 | 0.8357 9280 | 0.7044 8674 —0.45 | 1.0150 3414 | 0.7435 2983 
—0.84 | .8398 9231 | .7054 6832 —0.44 | 1.0199 1488 | .7444 9383 
—0.83 | .8440 1138 | .7064 4232 —0.43 | 1.0248 1389 | .7454 5674 
—0.82 | .8481 4675 | .7074 1663 —0.42 | 1.0297 3264 | .7464 1851 
—0.81 | .8523 0821 | .7083 9946 —0.41 | 1.0346 7062 | .7473 7912 
—0.80 | 0.8564 8599 | 0.7093 7852 —0.40 | 1.0396 2780 | 0.7483 3854 
—0.79 | .8606 8335 | .7103 5784 —0.39 | 1.0446 2089 | .7493 1748 
eos .8649 0030 | .7113 3738 —0.38 | 1.0495 9966 | .7502 5366 

.8691 3686 | .7123 1708 —0.37 | 1.0546 1429 | .7512 0929 
—0.76 | .8733 9299 | .7132 9699 —0.36 | 1.0596 4802 | .7521 6361 
gue os 5898 | 0.7142 6476 —0.35 | 1.0647 2041 | 0.7531 4085 
eae 19 6407 | .7152 5701 —0.34 | 1.0697 7268 | .7540 6815 

.8862 7901 | .7162 3708 —0.33 | 1.0748 6355 | .7550 1831 
—0.72| .8906 1355 | .7172 1713 ~0.32 | 1.0799 7340 | .7559 6702 
—0.71| .8949 6769] .7181 9712 —0.31 | 1.0851 0222 | .7569 1426 
Bech. mite oa 0.7191 7701 —0.30 | 1.0902 4996 | 0.7578 5998 
eo ele .7201 5676 —0.29 | 1.0954 1661 | .7588 0418 

: .9081 4770 | .7211 3635 —0.28 | 1.1006 0212 | .7597 4660 
—0.67 | .9125 8023 | .7221 1571 —0.27 | 1.1058 0647 | _7606 8785 
0.66 | .9170 3233] .7230 9481 0.26 | 1.1110 20962 | | 


-7616 2726 


TRUNCATED NORMAL DISTRIBUTION 493 


TABLE I: THE FUNCTIONS $1/(Z - \xi)$ AND $m_2/2m_1^2$ (Continued)

$\xi$ | $1/(Z - \xi)$ | $m_2/2m_1^2$ | $\xi$ | $1/(Z - \xi)$ | $m_2/2m_1^2$
—0.25 | 1.1162 7155 | 0.7625 6503 0.16 | 1.3468 8124 | 0.7992 9404 
—Q0.24 | 1.1215 2100 . 7634 8720 0.17 | 1.3528 7561 .8001 4178 
—0.23 | 1.1268 1158 . 7644 3550 0.18 | 1.3588 8690 .8009 8698 
—0.22 | 1.1821 0962 .7653 6815 0.19 | 1.3649 1507 .8018 2964 
—0.21 | 1.1874 2629 .7662 9904 0.20 | 1.3709 6007 .8026 6975 
—0.20 | 1.1427 6157 | 0.7672 2816 0.21 | 1.3770 2184 | 0.8035 0728 
—0.19 | 1.1481 1541 .7681 5546 0.22 | 1.3831 0034 . 80438 4224 
—0.18 | 1.1534 8776 .7690 8090 0.23 | 1.38891 9551 .8051 7460 
—0.17 | 1.1588 7863 |- .7700 0452 0.24 | 1.3953 0731 .8060 0436 
—0.16 | 1.1642 8794 .7709 2624 0.25 | 1.4014 3567 .8068 3151 
—Q.15 | 1.1697 1567 | 0.7718 4605 0.26 | 1.4075 8055 | 0.8076 5603 
—0.14 | 1.1751 6177 .7727 6392 0-27-| 1.4137 4189 .8084 7792 
—0.13 | 1.1806 2621 .7736 7983 0.28 | 1.4199 1965 .8092 9715 
—0.12 | 1.1861 0895 .7745 9376 0.29 | 1.4261 1376 .8101 1374 
—0.11 | 1.1916 0995 . 7755. 0568 0.30-| 1.4823 2418 .8109 2765 
—0.10 | 1.1971 2917 | 0.7764 1558 0.31 | 1.4885 5085 | 0.8117 3889 
—0.09 | 1.2026 6656 .7773 2342 0.32 | 1.4447 9372 .8125 4745 
—0.08 | 1.2082 2209 .7782 2919 0.33 | 1.4510 5273 .8133 5331 
—0.07 | 1.2137 9571 .7791 3286 0.34 | 1.4573 2783 .8141 5648 
—0.06 | 1.2193 8739 . 7800 3443 0.35 | 1.4686 1897 .8149 5693 
—0.05 | 1.2249 8601 | 0.7809 2001 0.36 | 1.4699 2609 | 0.8157 5466 
—0.04 | 1.2306 2473 .7818 3111 0.37 | 1.4762 4914 .8165 4967 
—0.03 | 1.2362 7486 .7827 3189 0.38 | 1.4825 8806 .8173 4195 
—0.02 | 1.2419 3376 . 7836 1907 0.39 | 1.4889 4280 .8181 3149 
—0.01 | 1.2476 1505 .7845 0973 0.40 | 1.4953 1330 .8189 1828 
0.00 | 1.2533 1414 | 0.7853 9816 0.41 | 1.5016 9951 | 0.8197 0231 
; 0.42 | 1.5081 0138 .8204 8359 
0.01 | 1.2590 5667 | 0.7863 1656 0.43 | 1.5145 1884 .8212 6211 
0.02 | 1.2647 6550 .7871 6822 0.44 | 1.5209 5184 .8220 3785 
0.03 | 1.2705 1768 .7880 4982 0.45 | 1.5274 0033 .8228 1081 

0.04 | 1.2762 8747 | .7889 2911 
0.05 | 1.2820 8904 . 7898 2392 0.46 | 1.5339 3484 | 0.8236 7304 
0.47 | 1.5403 4356 .8243 4840 
0.06 | 1.2878 7970 | 0.7906 8067 0.48 | 1.5468 3817 | .8251 13801 
0.07 | 1.2937 0204 | .7915 5291 0.49 | 1.5533 4806 | .8258 7482 
0. 1 


.5598 7315 | .8266 3383 


1 
1 

0.08 | 1.2995 4180 | .7924 2377 
1.3053 9893 . 7932 9024 
1 


6259 4816 | 0.8340 6925 
.6934 8199 | .8412 2194 
.7624 1805 | .8480 9147 
.8326 9980 | .8546 7938 
1.9042 7123 | .8609 889 


.3112 7339 | .7941 5529 


Bee ee 


0.12 | 1.3280 7409 | .7958 7808 
0.13 | 1.3290 0024 | .7967 3580 
0.14 | 1.3349 4351 .7975 9104 
0.15 | 1.3409 0386 | .7984 4379 


ee 


0.6 
0.7 
0.11 | 1.3171 6513 | 0.7950 1791 0.8 
OF9 
1.0 
TABLE I: THE FUNCTIONS $1/(Z - \xi)$ AND $m_2/2m_1^2$ (Continued)

$\xi$ | $1/(Z - \xi)$ | $m_2/2m_1^2$ | $\xi$ | $1/(Z - \xi)$ | $m_2/2m_1^2$
1.1 | 1.9770 7708 | 0.8670 245 | 2.1 | 2.7618 362 | 0.9139 42
1.2 | 2.0510 6317 | .8727 922 | 2.2 | 2.8449 813 | .9174 80
1.3 | 2.1261 7654 | .8782 986 | 2.3 | 2.9288 166 | .9208 44
1.4 | 2.2023 6569 | .8835 513 | 2.4 | 3.0133 080 | .9240 43
1.5 | 2.2795 8069 | .8885 585 | 2.5 | 3.0984 233 | .9270 84
1.6 | 2.3577 7330 | 0.8933 288 | 2.6 | 3.1841 318 | 0.9299 77
1.7 | 2.4368 9703 | .8978 711 | 2.7 | 3.2704 044 | .9327 27
1.8 | 2.5169 0714 | .9021 943 | 2.8 | 3.3572 134 | .9353 42
1.9 | 2.5977 6079 | .9063 078 | 2.9 | 3.4445 327 | .9378 30
2.0 | 2.6794 1689 | .9102 21 | 3.0 | 3.5323 375 | .9401 99


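distribution, or more precisely $\xi = (x_0' - m)/\sigma$, and $I_0(\xi)$ is the fraction
of the complete distribution retained after truncation; i.e. $I_0(\xi) = \int_\xi^\infty \varphi(t)\,dt$,
where $\varphi(t) = (1/\sqrt{2\pi}) \exp(-t^2/2)$. The P.L.F. estimating
equations are written as

$$\frac{n \sum x^2}{2\left(\sum x\right)^2} = \frac{1}{2}\left[\frac{1}{Z - \hat\xi}\right]\left[\frac{1}{Z - \hat\xi} - \hat\xi\right], \tag{2}$$

$$\hat\sigma = \frac{\sum x}{n}\left[\frac{1}{Z - \hat\xi}\right], \tag{3}$$

$$\hat m = x_0' - \hat\sigma\,\hat\xi, \tag{4}$$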


where $\sum x$ and $\sum x^2$ are respectively sums of the first and second powers
of sample measurements about the truncation point ($x = x' - x_0'$)
and $n$ is the number of sample observations. $Z$ is a function of $\xi$
defined as $Z(\xi) = \varphi(\xi)/I_0(\xi)$. The estimates obtained on solving these
equations are maximum likelihood as well as moment estimates and
accordingly are distinguished from the parameters estimated by the
symbol (^). The notation employed here is essentially that of [4], and
it differs in some respects from the notations of [1, 2, 3].

In practice, $n \sum x^2 / 2(\sum x)^2$ is calculated from the sample data and
equation (2) is solved for $\hat\xi$, which on substitution into (3) yields $\hat\sigma$.
Equation (4) is then employed to determine $\hat m$.
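
The three steps just described can also be carried out numerically rather than from tables. The following is a minimal sketch, assuming the forms of equations (2) to (4) given above; the function names, the bisection bounds and the sample summaries at the end are illustrative choices and do not come from the paper.

```python
import math


def z_ratio(xi):
    """Z(xi) = phi(xi) / I0(xi), where I0 is the unit normal area above xi."""
    phi = math.exp(-0.5 * xi * xi) / math.sqrt(2.0 * math.pi)
    upper_area = 0.5 * math.erfc(xi / math.sqrt(2.0))
    return phi / upper_area


def inv_z_minus_xi(xi):
    """First tabulated function, 1/(Z - xi)."""
    return 1.0 / (z_ratio(xi) - xi)


def moment_ratio(xi):
    """Second tabulated function, m2/(2*m1**2), the right side of (2)."""
    g = inv_z_minus_xi(xi)
    return 0.5 * g * (g - xi)


def solve_plf(n, sum_x, sum_x2, x0, lo=-4.0, hi=3.0, tol=1e-12):
    """Solve (2) for xi by bisection, then apply (3) and (4).

    The bounds lo and hi are an assumption: they mirror the range of
    Table I, over which moment_ratio increases with xi.
    """
    target = n * sum_x2 / (2.0 * sum_x ** 2)
    if not moment_ratio(lo) <= target <= moment_ratio(hi):
        raise ValueError("sample ratio outside the search range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if moment_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    xi_hat = 0.5 * (lo + hi)
    sigma_hat = (sum_x / n) * inv_z_minus_xi(xi_hat)
    m_hat = x0 - sigma_hat * xi_hat
    return xi_hat, sigma_hat, m_hat


# Hypothetical sample summaries, chosen only to exercise the function.
print(solve_plf(n=50, sum_x=60.0, sum_x2=110.0, x0=1.0))
```

Bisection is used here only because the tabulated ratio is monotone over the range considered; any one-dimensional root finder would serve equally well.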


To facilitate solution of equations (2) and (3), tables of $1/(Z - \xi)$ and of $m_2/2m_1^2$, where

$$\frac{m_2}{2m_1^2} = \frac{1}{2}\left[\frac{1}{Z - \xi}\right]\left[\frac{1}{Z - \xi} - \xi\right],$$
TABLE II: THE FUNCTIONS $W'(\xi)$, $w'(\xi)$ AND $\rho_{\hat\sigma\hat\xi}$

$\xi$ | $W'$ | $w'$ | $\rho$ | $\xi$ | $W'$ | $w'$ | $\rho$
−3.0 | 0.536 283 | 5.986 069 | 0.9114 | −0.5 | 2.439 898 | 12.569 370 | 0.9145
−2.9 | 0.545 333 | 5.786 631 | 0.9078 | −0.4 | 2.692 714 | 13.973 826 | 0.9188
−2.8 | 0.556 167 | 5.610 046 | 0.9043 | −0.3 | 2.975 044 | 15.600 726 | 0.9232
−2.7 | 0.569 038 | 5.457 103 | 0.9008 | −0.2 | 3.289 968 | 17.484 486 | 0.9275
−2.6 | 0.583 808 | 5.323 872 | 0.8974 | −0.1 | 3.640 849 | 19.664 939 | 0.9316
−2.5 | 0.602 029 | 5.225 319 | 0.8943 | 0.0 | 4.031 257 | 22.187 540 | 0.9359
−2.4 | 0.622 786 | 5.148 179 | 0.8914 | 0.1 | 4.465 167 | 25.105 193 | 0.9400
−2.3 | 0.646 862 | 5.098 163 | 0.8887 | 0.2 | 4.946 791 | 28.478 115 | 0.9439
−2.2 | 0.674 663 | 5.076 429 | 0.8864 | 0.3 | 5.480 680 | 32.375 335 | 0.9477
−2.1 | 0.706 637 | 5.084 354 | 0.8845 | 0.4 | 6.071 728 | 36.875 669 | 0.9513
−2.0 | 0.743 283 | 5.123 602 | 0.8831 | 0.5 | 6.725 176 | 42.068 884 | 0.9547
−1.9 | 0.785 157 | 5.196 188 | 0.8821 | 0.6 | 7.446 632 | 48.056 996 | 0.9579
−1.8 | 0.832 880 | 5.304 564 | 0.8816 | 0.7 | 8.242 087 | 54.995 696 | 0.9610
−1.7 | 0.887 141 | 5.451 691 | 0.8815 | 0.8 | 9.117 921 | 62.895 815 | 0.9639
−1.6 | 0.948 713 | 5.641 149 | 0.8820 | 0.9 | 10.080 921 | 72.025 044 | 0.9666
−1.5 | 1.018 458 | 5.877 237 | 0.8830 | 1.0 | 11.138 290 | 82.509 610 | 0.9691
−1.4 | 1.097 337 | 6.165 097 | 0.8845 | 1.1 | 12.297 666 | 94.536 317 | 0.9714
−1.3 | 1.186 420 | 6.510 851 | 0.8865 | 1.2 | 13.567 116 | 108.314 380 | 0.9736
−1.2 | 1.286 897 | 6.921 577 | 0.8889 | 1.3 | 14.955 170 | 124.077 852 | 0.9756
−1.1 | 1.400 090 | 7.406 409 | 0.8917 | 1.4 | 16.470 818 | 142.087 792 | 0.9775
−1.0 | 1.527 464 | 7.974 909 | 0.8949 | 1.5 | 18.123 532 | 162.634 894 | 0.9792
−0.9 | 1.670 639 | 8.639 143 | 0.8984 | 1.6 | 19.923 245 | 186.041 823 | 0.9808
−0.8 | 1.831 403 | 9.418 034 | 0.9022 | 1.7 | 21.880 362 | 212.665 989 | 0.9823
−0.7 | 2.011 724 | 10.312 859 | 0.9061 | 1.8 | 24.005 980 | 242.904 606 | 0.9837
−0.6 | 2.213 765 | 11.357 604 | 0.9103 | 1.9 | 26.311 135 | 277.190 113 | 0.9849
 | | | | 2.0 | 28.808 163 | 316.007 189 | 0.9861


The assistance of Mr. Walter Lynch in performing many of the computations in- 
volved in preparing these tables is gratefully acknowledged. 


were compiled as functions of $\xi$ using a standard interval of 0.01. For
values of $\xi$ less than −2.5 and for those greater than 0.5, the interval
is 0.1. Most entries are given to 8 decimals. The National Bureau of
Standards W.P.A. Tables of Normal Curve Areas and Ordinates [5]
served as a basis for the calculations.

Weighting factors $W'$ and $w'$, obtained from the variance-covariance
matrix for use in determining variances of $\hat\sigma$ and of $\hat\xi$, were computed
at intervals of 0.1 on $\xi$. Formulas used in these calculations are the
following, which are given in [4].

$$W'(\xi) = \frac{1 - Z(Z - \xi)}{[1 - Z(Z - \xi)][2 - \xi(Z - \xi)] - (Z - \xi)^2},$$

$$w'(\xi) = \frac{2 - \xi(Z - \xi)}{[1 - Z(Z - \xi)][2 - \xi(Z - \xi)] - (Z - \xi)^2}.$$

To estimate the required variances, one has only to read $W'$ and $w'$
corresponding to $\xi = \hat\xi$ from Table II and evaluate

$$V(\hat\sigma) = \frac{\sigma^2}{n} W', \qquad \text{and} \qquad V(\hat\xi) = \frac{w'}{n}.$$

The coefficient of correlation $\rho_{\hat\sigma\hat\xi}$ between sampling errors of $\hat\sigma$ and $\hat\xi$,
also obtained from the variance-covariance matrix, has been tabulated
to 4 decimals using the relation (cf. [4])

$$\rho_{\hat\sigma\hat\xi} = \frac{Z - \xi}{\sqrt{[1 - Z(Z - \xi)][2 - \xi(Z - \xi)]}}.$$
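
The weighting factors and the correlation coefficient can be evaluated directly from $Z(\xi)$. The sketch below assumes the expressions for $W'$, $w'$ and $\rho$ written above; the function names are illustrative and not part of the original paper.

```python
import math


def z_ratio(xi):
    """Z(xi) = phi(xi) / I0(xi), where I0 is the unit normal area above xi."""
    phi = math.exp(-0.5 * xi * xi) / math.sqrt(2.0 * math.pi)
    upper_area = 0.5 * math.erfc(xi / math.sqrt(2.0))
    return phi / upper_area


def variance_weights(xi):
    """Return (W', w', rho) for a given truncation point xi."""
    z = z_ratio(xi)
    d = z - xi                 # Z - xi
    a = 1.0 - z * d            # 1 - Z(Z - xi)
    b = 2.0 - xi * d           # 2 - xi(Z - xi)
    det = a * b - d * d        # common denominator of W' and w'
    return a / det, b / det, d / math.sqrt(a * b)


def estimate_variances(sigma_hat, xi_hat, n):
    """V(sigma_hat) = sigma^2 W'/n and V(xi_hat) = w'/n, with sigma_hat used for sigma."""
    big_w, small_w, _ = variance_weights(xi_hat)
    return sigma_hat ** 2 * big_w / n, small_w / n


# Entries for xi = 0, which may be compared with the corresponding row of Table II.
print(variance_weights(0.0))
```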


Designating the argument as 7 rather than £, Sampford [6] recently 
published tables of Z, Z(Z — &), and Z — £Z(Z — £) which in his 
notation are designated », \ and ¢ respectively. Although useful for 
the purposes intended, Sampford’s tables fail to meet the need which 
prompted preparation of the present tables. The most useful tables 
previously available for the present purposes are those of Hald [7]. 
He gives $\xi$ as a function of $m_2/2m_1^2$ at intervals of 0.001 but only to
three decimals. He also tabulated $1/(Z - \xi)$ at intervals of 0.1 and
gives elements of the variance-covariance matrix and the correlation
coefficient at the same interval. He does not, however, give the
variance weighting factors $W'$ and $w'$ explicitly. Hald's notation is
different from that employed by Sampford as well as from that used here.
He designates $m_2/2m_1^2$ as $y$, $\xi$ as $z$, and $1/(Z - \xi)$ as $g(z)$. It is also
noted that Hald's correlation coefficient $\rho_{z,m}$ relates to sampling errors
between $\hat\xi$ and $\hat m$, whereas the coefficient tabulated here relates to
similar errors between $\hat\sigma$ and $\hat\xi$.


An Illustrative Example. To demonstrate the practical use of these
tables, it is convenient to choose an example previously employed to
illustrate the iterative procedure of reference [4]. For this example,
$n = 37$, $\sum x = 51.8600$, $\sum x^2 = 98.0156$, and $x_0' = 0.850$. Accordingly,
we compute $n \sum x^2 / 2(\sum x)^2 = 0.67422043$. Entering Table I with
$m_2/2m_1^2 = 0.67422043$, we immediately read $\hat\xi = -1.16$, which is
correct to the two decimals given. For greater accuracy, we interpolate
linearly as summarized:

$\xi$ | $m_2/2m_1^2$
−1.1600 | 0.67462297
−1.1643 | 0.67422043
−1.1700 | 0.67368113

Thereby, after very little effort we obtain $\hat\xi = -1.1643$, which is in
exact agreement with the more laboriously obtained result of [4]. Direct
interpolation gives $1/(Z - \hat\xi) = 0.7168266$, and from equation (3),
$\hat\sigma = (51.8600/37)(0.7168266) = 1.0047$. From (4), $\hat m = 0.850 - (1.0047)(-1.1643) = 2.020$,
and thus all estimates calculated here
agree with the values obtained in [4]. To estimate variances of these
estimates, we interpolate in Table II to read $W'(-1.1643) = 1.337307$,
and $w'(-1.1643) = 7.094662$. Using these values and substituting
the estimate $\hat\sigma = 1.0047$ for the unknown $\sigma$, the variances become
$V(\hat\sigma) = [(1.0047)^2/37][1.337307] = 0.0365$, and $V(\hat\xi) = 7.094662/37 = 0.1917$.
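
The arithmetic of this example can be retraced in a few lines. The sketch below uses only numbers quoted in the text and in Table I (the entries for $\xi = -1.17$ and $\xi = -1.16$); the values of $W'$ and $w'$ are taken as quoted rather than recomputed, and the variable names are illustrative.

```python
# Sample summaries quoted in the example.
n, sum_x, sum_x2, x0 = 37, 51.8600, 98.0156, 0.850
ratio = n * sum_x2 / (2.0 * sum_x ** 2)

# Table I entries bracketing the sample ratio: (xi, m2/2m1^2, 1/(Z - xi)).
xi_lo, r_lo, g_lo = -1.17, 0.67368113, 0.71484125
xi_hi, r_hi, g_hi = -1.16, 0.67462297, 0.71832428

# Inverse linear interpolation for xi, rounded to four decimals as in the text.
frac = (ratio - r_lo) / (r_hi - r_lo)
xi_hat = round(xi_lo + frac * (xi_hi - xi_lo), 4)

# Direct linear interpolation for 1/(Z - xi) at the rounded xi.
g_hat = g_lo + (xi_hat - xi_lo) / (xi_hi - xi_lo) * (g_hi - g_lo)

sigma_hat = (sum_x / n) * g_hat       # equation (3)
m_hat = x0 - sigma_hat * xi_hat       # equation (4)

# Weighting factors as quoted in the text for xi = -1.1643.
w_big, w_small = 1.337307, 7.094662
var_sigma = sigma_hat ** 2 / n * w_big
var_xi = w_small / n

print(round(ratio, 8), xi_hat, round(g_hat, 7))
print(round(sigma_hat, 4), round(m_hat, 3), round(var_sigma, 4), round(var_xi, 4))
```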
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1. Introduction 


This paper summarizes, coordinates and extends a series of researches 
carried out by F. Pimentel Gomes, E. Malavolta, W. L. Stevens and 
I. R. Nogueira concerning the application of Mitscherlich’s law 


$$y = A\left[1 - 10^{-c(x+b)}\right] \quad \text{or} \quad y = \alpha + \beta\rho^{x}$$

to the statistical analysis of experiments with fertilizers. The above
equation was introduced by Eilh. Alfred Mitscherlich [2] in the study
of fertilization of soils in Germany. Parameter $A$ measures a maximum
yield which could not be exceeded by the use of the fertilizer in
consideration. Parameter $c$ measures the efficiency of the fertilizer and
$b$ measures the soil content of the fertilizer in the control plots in a form
assimilable by the plant.

Mitscherlich carried out many experiments in Germany, most of
them in pots kept in green-houses. The fitting of the curve was
obtained with only two levels of the fertilizer, since a constant known
value was attributed to $c$, always the same for each fertilizer. But
Kletschkowsky and Shelesnow [1] proved that $c$ is not constant, but
varies with the amounts of other fertilizers. From that time on the
equation has not been used much, partly because of the apparent
difficulty of fitting it by satisfactory methods. Several authors have
applied very crude processes of fitting, some of which have tended to
bring the use of the curve into disrepute. But in the last few years
advances have been made in the theory, so that the fitting of the curve
can now be obtained quickly and accurately in most cases suitable for
practical application, and efficient estimates may be computed for the
variances of the estimates of the parameters.


2. The Estimation of the Parameters 


If in an experiment $x_1, x_2, \ldots, x_n$ are the amounts of a fertilizer
used and $y_1, y_2, \ldots, y_n$ are the yields obtained, then the estimation
of the parameters by the method of least squares should be carried out
taking into account the function

$$w = \sum \left\{ y - A\left[1 - 10^{-c(x+b)}\right] \right\}^2,$$

of which the derivatives $\partial w/\partial A$, $\partial w/\partial b$, $\partial w/\partial c$ are to be equated to
zero. The following equations are then obtained:

$$\begin{aligned}
\sum y - nA + A\,10^{-cb} \sum 10^{-cx} &= 0, \\
\sum x y\,10^{-cx} - A \sum x\,10^{-cx} + A\,10^{-cb} \sum x\,10^{-2cx} &= 0, \\
\sum y\,10^{-cx} - A \sum 10^{-cx} + A\,10^{-cb} \sum 10^{-2cx} &= 0.
\end{aligned} \tag{2.1}$$

Therefore we must have, by Rouché-Capelli's theorem,

$$\begin{vmatrix}
\sum y & n & \sum 10^{-cx} \\
\sum x y\,10^{-cx} & \sum x\,10^{-cx} & \sum x\,10^{-2cx} \\
\sum y\,10^{-cx} & \sum 10^{-cx} & \sum 10^{-2cx}
\end{vmatrix} = 0. \tag{2.2}$$


This equation, obtained by Pimentel-Gomes and Malavolta [4],
depends only on parameter $c$. Its solution, to get an estimate $\hat c$ of $c$,
is generally very troublesome. However, we can always fix the amounts
of fertilizer $x_1, x_2, \ldots, x_n$ as multiples of a number $q$, that is,

$$x_i = m_i q \qquad (i = 1, 2, \ldots, n),$$

where $m_i$ is an integer. Now put $10^{-cq} = z$ and we obtain

$$\begin{vmatrix}
\sum y & n & \sum z^{m} \\
\sum x y z^{m} & \sum x z^{m} & \sum x z^{2m} \\
\sum y z^{m} & \sum z^{m} & \sum z^{2m}
\end{vmatrix} = 0. \tag{2.3}$$
As c and q are necessarily positive, we must have 0 < z < 1, that is, 
the only root which should be obtained must lie between zero and one. 
Nogueira [3] proved that equation (2.3) has a zero of order three
for $z = 1$ when the levels are supposed to be equally spaced. Therefore
in this case it can be divided by $(z - 1)^3$. This division diminishes
its degree and provides a good check for the computations. The
expansion of the determinant leads to the equation

$$P_1(z) \sum y + P_2(z) \sum x y z^{m} + P_3(z) \sum y z^{m} = 0, \tag{2.4}$$

where $P_1(z)$, $P_2(z)$ and $P_3(z)$ are polynomials in $z$. If we suppose that
$p$ different levels, $x_1, x_2, \ldots, x_p$, were used and if $r$ is the number of
replications ($pr = n$), then equation (2.4) can be written as follows:

$$\begin{aligned}
&\bar y_1\left[P_1(z) + x_1 z^{m_1} P_2(z) + z^{m_1} P_3(z)\right] \\
&\quad + \bar y_2\left[P_1(z) + x_2 z^{m_2} P_2(z) + z^{m_2} P_3(z)\right] \\
&\quad + \cdots \\
&\quad + \bar y_p\left[P_1(z) + x_p z^{m_p} P_2(z) + z^{m_p} P_3(z)\right] = 0,
\end{aligned} \tag{2.5}$$

where $\bar y_1, \bar y_2, \ldots, \bar y_p$ are mean yields of the $r$ replicates corresponding
to each level. The coefficients of $\bar y_1, \bar y_2, \ldots, \bar y_p$ in the last equation
are also polynomials in $z$, and they can be all divided by $(z - 1)^3$ and
by $z$. When this division is carried out, the following equation is
obtained:

$$\bar y_1 J_{p1}(z) + \bar y_2 J_{p2}(z) + \cdots + \bar y_p J_{pp}(z) = 0. \tag{2.6}$$


The polynomials $J_{pi}(z)$ ($i = 1, 2, \ldots, p$) are always the same in
comparable sets of experiments. They can be tabulated, therefore,
for suitable values of $p$ and for $z$ between zero and one. The tables
constructed by Pimentel-Gomes and Nogueira [7] for $p = 5$ are
reproduced below, as well as new tables for the case of $p = 4$. This
covers most of the cases to be met in practice, since for the case of
$p = 3$ no tables are necessary and with less than three levels the fitting
is not possible.

When the tables are available, the solution of equation (2.6)
becomes easy. After obtaining a root $z$ between zero and one, we compute

$$\hat c = \frac{1}{q}\,\operatorname{colog} z = \frac{1}{q} \log (1/z). \tag{2.7}$$

If the equation has no root between zero and one, the fitting of the
curve cannot be carried out. On the other hand, if two or more such
roots were to be found, it would not be possible to choose among them.
However, this case has never been observed in the applications.

If we put in (2.3) $\bar y_1 = \bar y_2 = \cdots = \bar y_p = K$, it is easy to see that
the determinant vanishes identically. Therefore from (2.6) it follows
that

$$J_{p1}(z) + J_{p2}(z) + \cdots + J_{pp}(z) = 0,$$

a relation which was useful for checking the tables for these polynomials.
This identity shows also that if we subtract from the treatment means
$\bar y_1, \bar y_2, \ldots, \bar y_p$ a constant number $K$, the remainders obtained satisfy
also equation (2.6) and can be used to obtain the correct root. If
$K = \bar y_1$, the first remainder is zero and equation (2.6) is changed to

$$(\bar y_2 - \bar y_1) J_{p2}(z) + (\bar y_3 - \bar y_1) J_{p3}(z) + \cdots + (\bar y_p - \bar y_1) J_{pp}(z) = 0, \tag{2.8}$$

which is easier to solve.

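After obtaining a root $z$, from which $\hat c$ is calculated by (2.7), the
values of $\hat A$ and $\hat b$ can be easily obtained from (2.1). For the special
case of $p = 4$ we have

$$\hat A = \frac{1}{P_4(z)}\left[\bar y_1(-z - z^2 - z^3) + \bar y_2(1 - z^2 - z^3) + \bar y_3(1 + z - z^3) + \bar y_4(1 + z + z^2)\right] \tag{2.9}$$

or

$$\hat A = \bar y_1 + \frac{1}{P_4(z)}\left[(\bar y_2 - \bar y_1)(1 - z^2 - z^3) + (\bar y_3 - \bar y_1)(1 + z - z^3) + (\bar y_4 - \bar y_1)(1 + z + z^2)\right] \tag{2.10}$$

with $P_4(z) = (1 - z)(3 + 4z + 3z^2)$, and

$$\hat b = (1/\hat c) \log \frac{1 + z + z^2 + z^3}{4 - (1/\hat A)(\bar y_1 + \bar y_2 + \bar y_3 + \bar y_4)}. \tag{2.11}$$

For the case of $p = 5$ we obtain

$$\hat A = \frac{1}{P_5(z)}\left[\bar y_1(-z - z^3) + \bar y_2(1 - z - z^3) + \bar y_3(1 - z^3) + \bar y_4(1 + z^2 - z^3) + \bar y_5(1 + z^2)\right] \tag{2.12}$$

or

$$\hat A = \bar y_1 + \frac{1}{P_5(z)}\left[(\bar y_2 - \bar y_1)(1 - z - z^3) + (\bar y_3 - \bar y_1)(1 - z^3) + (\bar y_4 - \bar y_1)(1 + z^2 - z^3) + (\bar y_5 - \bar y_1)(1 + z^2)\right], \tag{2.13}$$

where $P_5(z) = 2(1 - z)(2 + z + 2z^2)$, and

$$\hat b = (1/\hat c) \log \frac{1 + z + z^2 + z^3 + z^4}{5 - (1/\hat A)(\bar y_1 + \bar y_2 + \bar y_3 + \bar y_4 + \bar y_5)}. \tag{2.14}$$

If the usual assumption of homoscedastic normal distribution of
the $y$'s is accepted, the method of least squares, which we have used,
is equivalent to that of maximum likelihood.

An equation similar to (2.8) can be obtained even with unequally
spaced levels.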
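
Once a root $z$ is in hand, the remaining estimates follow by direct substitution. The sketch below evaluates (2.7), (2.13) and (2.14) for five equally spaced levels $0, q, 2q, 3q, 4q$, assuming the forms of those equations given above; the function name and the parameter values in the check are hypothetical. The check builds error-free yields from known parameters and recovers them.

```python
import math


def mitscherlich_p5(means, z, q):
    """Estimates (A, b, c) for five equally spaced levels 0, q, ..., 4q.

    means: treatment means at the five levels; z: root of (2.6), 0 < z < 1.
    """
    y1, y2, y3, y4, y5 = means
    p5 = 2.0 * (1.0 - z) * (2.0 + z + 2.0 * z ** 2)
    # Equation (2.13): A expressed through differences from the control mean.
    a_hat = y1 + (
        (y2 - y1) * (1.0 - z - z ** 3)
        + (y3 - y1) * (1.0 - z ** 3)
        + (y4 - y1) * (1.0 + z ** 2 - z ** 3)
        + (y5 - y1) * (1.0 + z ** 2)
    ) / p5
    # Equation (2.7): c from the root z.
    c_hat = math.log10(1.0 / z) / q
    # Equation (2.14): b from A, c and the sum of the means.
    s1 = 1.0 + z + z ** 2 + z ** 3 + z ** 4
    b_hat = math.log10(s1 / (5.0 - sum(means) / a_hat)) / c_hat
    return a_hat, b_hat, c_hat


# Hypothetical parameters, used only to generate exact yields for a check.
A, b, c, q = 100.0, 20.0, 0.01, 40.0
exact_means = [A * (1.0 - 10.0 ** (-c * (m * q + b))) for m in range(5)]
print(mitscherlich_p5(exact_means, 10.0 ** (-c * q), q))
```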


3. The Analysis of Variance 

In the analysis of variance the sum of squares and the degrees of
freedom corresponding to treatment effects should be split into two
parts, one attributable to regression by Mitscherlich's law and the
other to deviations from the regression curve. If we have $p$ levels
and $r$ replications ($pr = n$), then the treatments sum of squares is

$$r \sum\nolimits' (\bar y_j - \bar y)^2,$$

where $\bar y_j$ ($j = 1, 2, \ldots, p$) are the treatment means, $\bar y$ is the grand
mean and $\sum'$ denotes summation over $j$. If $\hat y_j$ is the yield at the
$j$th level, computed by the regression equation, we have

$$\sum\nolimits' (\bar y_j - \bar y)^2 = \sum\nolimits' (\bar y_j - \hat y_j)^2 + \sum\nolimits' (\hat y_j - \bar y)^2 + 2 \sum\nolimits' (\bar y_j - \hat y_j)(\hat y_j - \bar y). \tag{3.1}$$

From equations (2.1) it can be seen that the last term on the right
of (3.1) vanishes, as shown by Pimentel-Gomes [5], who proved also
that this is not true when the method of moments is used to estimate
the parameters. Hence we obtain:

$$\sum\nolimits' (\bar y_j - \bar y)^2 = \sum\nolimits' (\bar y_j - \hat y_j)^2 + \sum\nolimits' (\hat y_j - \bar y)^2. \tag{3.2}$$

It appears natural to assign degrees of freedom p — 3 to the first 
term and 2 to the second, both on the right hand side of the last relation, 
as done by Pimentel-Gomes [5] and Stevens [9]. Then the first term 
on the right of (3.2), divided by $p - 3$, provides an estimate for the
variance of the deviations from the regression equation, which should 
not be significantly different from the residual variance if the regression 
curve fits the data well. The second term on the right of (3.2), divided 
by 2, gives an estimate, with two degrees of freedom, for the variance 
attributable to effects of regression by Mitscherlich’s equation, and 
should be significantly greater than the residual variance. 

As there are three parameters in the equation to be fitted, at least
four levels of fertilization (the control included) should be used, in
order that at least one degree of freedom should be left for testing the
goodness of fit of the curve. But a large number of levels is generally
not good, especially if factorial designs are used, for the number of
treatments would increase excessively. It seems that in most cases
4 or 5 levels are sufficient. However, if previous experiments have
shown that the law applies with reasonable accuracy in the case
under study, three levels may be used. But the use of three levels
without previous testing seems to be dangerous, since examples are
known where the law failed to describe the increases in yield
produced by the fertilizer.


4. Stevens' Method


In 1951 W. L. Stevens published an article [9] in which a new method
of estimation of the parameters in Mitscherlich's equation is introduced.
He uses the equation in the form $y = \alpha + \beta\rho^{x}$ and supposes $x = 0, 1, \ldots, p - 1$.
We have, therefore,

$$\alpha = A, \qquad \rho = 10^{-c}, \qquad \beta = -A\,10^{-cb}.$$

The method of least squares leads now to a system of equations
exactly equivalent to (2.1). Following Fisher's method, Stevens
takes a rough estimate $r'$ for $r$, substitutes $r' + \delta r$ for $r$ in the normal
equations and discards the terms where $\delta r$ has exponent greater than
one, assuming that $\delta r$ is small. He then computes corrections for the
rough estimates $a'$, $b'$, $r'$ and obtains the efficient estimates

$$\begin{aligned}
a &= F_{aa} \sum y + F_{ab} \sum y\,r'^{x} + F_{ar} \sum x y\,r'^{x-1}, \\
b &= F_{ab} \sum y + F_{bb} \sum y\,r'^{x} + F_{br} \sum x y\,r'^{x-1}, \\
r &= r' + \delta r,
\end{aligned}$$

where

$$\delta r = \left(F_{ar} \sum y + F_{br} \sum y\,r'^{x} + F_{rr} \sum x y\,r'^{x-1}\right) / b,$$

$F_{aa}$, $F_{ab}$, $F_{ar}$, $F_{bb}$, $F_{br}$ and $F_{rr}$ being the elements of the reciprocal of the
matrix

$$\begin{pmatrix}
n & \sum r'^{x} & \sum x\,r'^{x-1} \\
\sum r'^{x} & \sum r'^{2x} & \sum x\,r'^{2x-1} \\
\sum x\,r'^{x-1} & \sum x\,r'^{2x-1} & \sum x^2\,r'^{2x-2}
\end{pmatrix},$$

which were tabulated by Stevens [9].

However, if the preliminary estimate r’ (obtained, as Stevens sug- 
gests, by graphical interpolation or by other inefficient methods) is not 
good enough, the convergence is, sometimes, rather slow. It is more 
difficult, also, to obtain an upper bound for the error committed by 
stopping the iterative process at a certain stage, and the computations 
are more troublesome. In addition, the tables published include only 
values of z from 0.25 to 0.70 for the cases of 5 or 6 levels, and from 0.30 
to 0.75 for the case of 7 levels. If in a series of experiments some of the 
values of r fall within these limits and some outside of them, we are 
going to get in trouble if we want to use Mitscherlich’s law in the analy- 
sis of all of them. This difficulty, of course, can be solved by extending 
the tables to cover a wider range of values.
Also, it sometimes happens that the preliminary estimate is outside 
the interval covered by the tables, while the true value is within it. 
This was the case, for example, in an experiment carried out by Dr. 
W. L. Nelson, of the Agronomy Department of the North Carolina 
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State College, in 1946. The fertilizer was superphosphate, at the 
levels of 0, 40, 80, 120 and 160 pounds per acre, and the crop was Irish 
potato. The mean yields obtained were as follows in pounds per 
plot of 1/65 of an acre. 


Level | 0 | 40 | 80 | 120 | 160
Yield | 229.1 | 231.8 | 254.2 | 250.6 | 249.6


A preliminary estimate could be obtained as follows, as suggested
by Stevens:

$$r' = \frac{249.6 - 231.8}{250.6 - 229.1} = 0.828.$$

Another rough estimate could be obtained by taking

$$a' = \frac{254.2 + 250.6 + 249.6}{3} = 251.5,$$

$$-b' = 251.5 - 229.1 = 22.4,$$

$$r' = \frac{251.5 - 231.8}{22.4} = 0.879.$$

If we try $r' = 0.70$, then we obtain $\delta r = -0.333$ and an improved
estimate is

$$r = 0.70 - 0.333 = 0.367 \approx 0.37.$$

This value is next corrected to $0.667 \approx 0.67$ by the same method.
A third iteration gives $r = 0.455$.
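
A single correction of this kind can be computed without the tabulated $F$ values by regressing $y$ on $r'^{x}$ and $x\,r'^{x-1}$, which gives the same normal equations. The sketch below is an illustration under that assumption and is not Stevens' tabular procedure itself; the function name is hypothetical, and the yields are those given above with the levels coded $x = 0, 1, \ldots, 4$.

```python
def stevens_step(y, r):
    """One first-order correction for y = a + b * r**x, x = 0, 1, ..., p - 1.

    Returns (a, b, delta_r), where b * delta_r is the coefficient of
    x * r**(x - 1) in the linearised least-squares fit.
    """
    p = len(y)
    u = [r ** x for x in range(p)]                           # r'^x
    v = [x * r ** (x - 1) if x else 0.0 for x in range(p)]   # x r'^(x-1)
    ybar, ubar, vbar = sum(y) / p, sum(u) / p, sum(v) / p
    suu = sum((ui - ubar) ** 2 for ui in u)
    svv = sum((vi - vbar) ** 2 for vi in v)
    suv = sum((ui - ubar) * (vi - vbar) for ui, vi in zip(u, v))
    suy = sum((ui - ubar) * (yi - ybar) for ui, yi in zip(u, y))
    svy = sum((vi - vbar) * (yi - ybar) for vi, yi in zip(v, y))
    det = suu * svv - suv ** 2
    b = (svv * suy - suv * svy) / det
    g = (suu * svy - suv * suy) / det      # estimate of b * delta_r
    a = ybar - b * ubar - g * vbar
    return a, b, g / b


yields = [229.1, 231.8, 254.2, 250.6, 249.6]
a, b, delta_r = stevens_step(yields, 0.70)
print(round(delta_r, 3))
```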

The true value lies between 0.57 and 0.58. We see, therefore,
that in this case the convergence is rather slow if Stevens' method is
used.
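
The location of the least-squares root can be checked by a direct search. For each trial value of $z$ the constants $\alpha$ and $\beta$ of $y = \alpha + \beta z^{x}$ enter linearly, so the residual sum of squares can be minimised over a grid of $z$. The sketch below does this for the yields above; the grid spacing of 0.001 and the function name are arbitrary choices, and this is not the tabular method of the paper.

```python
def profile_fit(y, z):
    """Least-squares alpha, beta for y = alpha + beta * z**x and the residual SS."""
    p = len(y)
    u = [z ** x for x in range(p)]
    ybar, ubar = sum(y) / p, sum(u) / p
    suu = sum((ui - ubar) ** 2 for ui in u)
    suy = sum((ui - ubar) * (yi - ybar) for ui, yi in zip(u, y))
    beta = suy / suu
    alpha = ybar - beta * ubar
    sse = sum((yi - alpha - beta * ui) ** 2 for ui, yi in zip(u, y))
    return alpha, beta, sse


yields = [229.1, 231.8, 254.2, 250.6, 249.6]

# Scan z over (0, 1) and keep the value with the smallest residual sum of squares.
grid = [k / 1000.0 for k in range(1, 1000)]
z_hat = min(grid, key=lambda z: profile_fit(yields, z)[2])
alpha_hat, beta_hat, sse_hat = profile_fit(yields, z_hat)
print(z_hat, round(alpha_hat, 2), round(beta_hat, 2))
```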

Assuming that the tables are extended, as suggested above, another
difficulty to be found in the applications can be exemplified by the
present case. In fact, there is no certainty, from the theory, that the
iteration process will converge [10, 11]. Examples can be built up
where $r' + \delta r$ will be farther from the true root than $r'$, even if this is
not usually the case. But one might think that these "pathological"
cases do not happen in practice. The above data show that
this is not true. In fact, if we take $r' = 0.80$ as a preliminary estimate
we shall find $\delta r = 1.1$, which is clearly absurd, since the true root is
between 0.57 and 0.58 and $r$ cannot exceed one. It would be possible
to avoid this difficulty by the application of the "damped least squares"
[10], but then the computational work would be heavier.


5. The Economic Aspect of Fertilization 


The use of Mitscherlich’s equation, when appropriate, allows 
suitable consideration of the economic factors behind any experiment 
on fertilizers. The most important criticism against its use in this 
sense is that it does not take in consideration the interactions between 
fertilizers. This criticism can be answered by the following statements: 

I) In experiments with fertilizers the interactions are usually rather 
unimportant for a reasonably wide range of the amounts of the nutrients; 

II) If the experiment shows significant interactions, different equa- 
tions should be worked out for each level of the other fertilizers under 
consideration, the general approach to be described below can be used 
for each case, and the best solution selected; 

III) If the experiment does not allow proper consideration of the 
interactions (which is not unusual, partly due to the acceptance of 
statement I) then this method gives the most profitable level of ferti- 
lization under the conditions of the experiment, which is the best that 
can be done under the restrictions imposed by the data. 

Therefore we shall consider that the amounts of just one fertilizer
are to vary, all others being kept fixed. If $t$ is the price of one unit of
this particular fertilizer, an increment $dx$ of its amount produces an
increase $dE$ in the expenses, being

$$dE = t\,dx.$$

From Mitscherlich's equation we find that in the first year the
increase of yield obtained will be

$$dy_1 = A_1 c\,(1/\log e)\,10^{-c(x+b)}\,dx,$$

where log denotes the common logarithm and $e = 2.718\ldots$ is the base
of natural logarithms. Hence the increase in the income in the first
crop will be

$$dI_1 = A_1 w c\,(1/\log e)\,10^{-c(x+b)}\,dx,$$

$w$ being the price of one unit of the crop yield. But the fertilizer has a
residual effect which will appear in the following years. The increase
in the yield in the $(i + 1)$th crop, attributable to the last increment
$dx$ applied in the first year may be taken as

$$dy_{i+1} = A_{i+1} c H_i\,(1/\log e)\,10^{-c(H_i x + b)}\,dx,$$

where $H_i$ ($0 < H_i < 1$) is the fraction of the fertilizer applied in the
first year still available to the plant. Hence we find

$$dI_{i+1} = A_{i+1} w c H_i\,(1/\log e)\,10^{-c(H_i x + b)}\,dx.$$


It seems reasonable to put

$$\frac{dI_1}{f} + \frac{dI_2}{f^2} + \frac{dI_3}{f^3} + \cdots \ge dE, \tag{5.1}$$

$f > 1$ being a safety factor which also takes into account the interest
paid on capital invested in fertilizers. If $x \le x^*$ is the solution of
(5.1), then we say that $x^*$ is the most profitable level of fertilization,
meaning that, under the conditions of the experiment, it is the highest
amount to be used such that every dollar spent on fertilizers brings
an increase in the yield large enough to cover safely this expense and
the interest paid on it and give some profit.

The solution of (5.1) will usually be rather difficult to obtain,
unless some simplifying assumptions are made. It seems reasonable,
for example, to replace every $A_i$ ($i = 1, 2, 3, \ldots$) by an average
value $A$. Also we may take

$$H_i = h^i,$$

with $0 < h < 1$, even if this seems to be a pessimistic hypothesis, for
when a fertilizer is applied to the soil the loss in the first and second
year is large, but afterwards the losses are relatively small. However,
in this case a pessimistic hypothesis seems to be better than an optimistic
one, and, in addition, usually new applications of the fertilizer will be
made in the following years, so that the losses will be high.

Inequality (5.1) now becomes

$$A w c\,(1/f \log e)\,10^{-c(x+b)}\,S(f, h, x) > t, \tag{5.2}$$

where

$$S(f, h, x) = \sum_{i=0}^{\infty} \frac{h^i}{f^{\,i}}\,10^{c(1-h^i)x}.$$


It can be shown that a reasonable approximation to the sum of
this series, which underestimates its value, is

$$S \approx \frac{10^{cxh(1-h)/f}}{1 - h/f}.$$

If this approximation is substituted in (5.2), we obtain

$$A w c > 10^{ckx}\,(f - h)\,t\,10^{cb} \log e, \tag{5.3}$$

hence

$$x < \frac{1}{ck}\left\{\log \frac{A w c}{t (f - h) \log e} - cb\right\}, \tag{5.4}$$

where

$$k = 1 - h(1 - h)/f.$$
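As a minimal sketch, the bound (5.4) can be evaluated numerically once $A$, $w$, $c$, $t$, $b$, $f$ and $h$ are known. The function names below are illustrative only and do not come from the text; "log" is taken as the common logarithm throughout, as in the paper.

```python
import math

LOG10_E = math.log10(math.e)  # "log e" in the text, about 0.4343


def k_factor(h, f):
    """k = 1 - h(1 - h)/f, the quantity defined after (5.4)."""
    return 1.0 - h * (1.0 - h) / f


def most_profitable_level(A, w, c, t, b, f, h):
    """Right-hand side of (5.4): (1/ck) * {log[Awc / (t (f-h) log e)] - cb}.

    A: average maximum yield, w: price per unit of crop,
    c, b: Mitscherlich parameters, t: price per unit of fertilizer,
    f: safety factor (> 1), h: carry-over fraction (0 <= h < 1).
    """
    k = k_factor(h, f)
    arg = A * w * c / (t * (f - h) * LOG10_E)
    return (math.log10(arg) - c * b) / (c * k)
```

With `h = 0` the function reduces to $(1/c)\log[Awc/(ft\log e)] - b$, the case in which no residual effect is considered.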


If we take $h = 0$, that is, if we assume that no residual effect is
taken into consideration, then we find

$$x^* = \frac{1}{c} \log \frac{A w c}{f t \log e} - b, \tag{5.5}$$

a well known formula, which has been successfully applied in Brazil
[4, 5, 8].

The minimum value of k is easily found to be 1 — (1/4f). Bra- 
zilian experts on fertilizers generally agreed that a value f = 1.5 would 
be suitable in most cases. Then the minimum value of k would be 
0.833, which is not very different from one. Hence a simpler and 
conservative formula for x* would be 


$$x^* = (1/c) \log\,(A w c / t \log e) - b, \tag{5.6}$$

where we take $k = f - h = 1$.
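The minimum value of $k$ quoted above can be checked in one step. The short derivation below is a sketch that treats $f$ as fixed and $h$ as the only variable.

```latex
k(h) = 1 - \frac{h(1-h)}{f}, \qquad
\frac{dk}{dh} = -\frac{1 - 2h}{f} = 0 \;\Longrightarrow\; h = \tfrac{1}{2},
\qquad
k_{\min} = 1 - \frac{1}{4f}, \qquad
f = 1.5 \;\Longrightarrow\; k_{\min} = \tfrac{5}{6} \approx 0.833 .
```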

Formula (5.4) gives the amount of fertilizer $x_1 = x^*$ to be applied
in the first year, assuming that no nutrient of that kind has been added
for some years. But in the second year (or second crop) we should
have an inequality similar to (5.4) with $x$ replaced by $x + hx^*$, so
that the most profitable level would now be

$$x_2 = x^*(1 - h).$$

It is easily seen that all the following amounts of fertilizer to be
applied are going to be equal to $x^*(1 - h)$ also.
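The statement that every later application equals $x^*(1 - h)$ can be illustrated with a short simulation. This is only a sketch under the geometric carry-over assumption: each year a fraction $h$ of the nutrient available in the previous year is still available. The function name is hypothetical.

```python
def available_nutrient(x_star, h, years):
    """Nutrient available to each crop when x* is applied in year 1
    and x*(1 - h) in every later year, with carry-over fraction h."""
    available = []
    carried = 0.0  # what is left over from the previous year
    for year in range(years):
        applied = x_star if year == 0 else x_star * (1.0 - h)
        total = carried + applied
        available.append(total)
        carried = h * total
    return available
```

Under these assumptions the amount available to every crop stays at $x^*$, which is why the same top-up $x^*(1 - h)$ recurs each year.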


6. The Estimation of h 


The estimation of $h$ can be carried out when the same experiment
is repeated in the same plots in two or more successive years (or crops).
Suppose that in the first year the Mitscherlich equation was

$$y_1 = A_1[1 - 10^{-c(x+b_1)}].$$

If no further application of the fertilizer is carried out, we may
assume that in the following year the appropriate equation will be

$$y_2 = A_2[1 - 10^{-c(hx+b_2)}] = A_2[1 - 10^{-c'(x+b')}].$$

The estimation of $c' = ch$ and $b' = b_2/h$ is carried out, of course,
in the same way as that of $c$ and $b$ described in section 2 above. Hence

$$\hat h = \hat c'/\hat c, \qquad \hat b_2 = \hat b'\,\hat h.$$

From the few experiments we have in which the application of this
method was possible it seems that, if the soil had not received the
fertilizer for some years before, $\hat b_1$ and $\hat b_2$ are essentially estimating
the same thing. When this is true, a better formula for the estimation
could be used and would give a value of $h$ between $\hat c'/\hat c$ and $\hat b_1/\hat b'$.
However, more experiments must be carried out and studied before
we can advance further in this direction.

If the fertilizer is applied to the plots, in the second year, in the
same amounts as before, we have

$$c' = c(1 + h), \qquad b' = b_2/(1 + h),$$

so that

$$\hat h = \hat c'/\hat c - 1.$$
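The two estimators of $h$ differ only in whether the fertilizer was applied again in the second year. A small sketch follows; the function names are illustrative, and the numerical values are the estimates of $c$ and $c'$ reported in the example of section 8.

```python
def h_without_reapplication(c_first, c_second):
    """h = c'/c when no fertilizer is added in the second year."""
    return c_second / c_first


def h_with_reapplication(c_first, c_second):
    """h = c'/c - 1 when the same amounts are applied again."""
    return c_second / c_first - 1.0


# Estimates from the potato example (same amounts applied in 1946).
h_hat = h_with_reapplication(0.006743, 0.01186)
print(round(h_hat, 2))
```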
7. The Variance of the Estimates of the Parameters 


Stevens [9] showed that the variances and covariances of the estimates
$a$, $b$, $r$ of the parameters $\alpha$, $\beta$, $\rho$ are given by the formulas

$$V(a) = F_{aa}s^2, \quad V(b) = F_{bb}s^2, \quad V(r) = (F_{rr}/b^2)s^2,$$
$$\mathrm{Cov}(a, b) = F_{ab}s^2, \quad \mathrm{Cov}(a, r) = (F_{ar}/b)s^2, \quad \mathrm{Cov}(b, r) = (F_{br}/b)s^2,$$


where $s^2$ is, as usual, the estimate of the residual variance, and $F_{aa}$,
$F_{ab}$, etc. are rational functions in $r$, the same described in an earlier section.
For the important cases of three and four levels these functions were 
not tabulated, but they are rather simple, so that the tabulation is 
not too important. In fact, for the case of three levels they are: 


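$$\begin{aligned}
F_{aa} &= (1 - r)^{-4}(1 + 4r^2 + r^4),\\
F_{ab} &= -(1 - r)^{-4}(1 + 3r^2 + 2r^3),\\
F_{ar} &= -(1 - r)^{-3}(1 + r)(1 + r + r^2),\\
F_{bb} &= (1 - r)^{-4}(2 - 4r + 8r^2),\\
F_{br} &= (1 - r)^{-3}(1 + r + 4r^2),\\
F_{rr} &= 2(1 - r)^{-2}(1 + r + r^2).
\end{aligned}$$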


For the case of four levels we obtain:

$$\begin{aligned}
F_{aa} &= Q(r)(1 + 4r^2 + 10r^4 + 4r^6 + r^8),\\
F_{ab} &= -Q(r)(1 + 3r^2 + 9r^4 + 4r^5 + 3r^6),\\
F_{ar} &= -Q(r)(1 - r)(1 + 2r + 5r^2 + 4r^3 + 5r^4 + 2r^5 + r^6),\\
F_{bb} &= Q(r)(3 - 4r + 6r^2 - 12r^3 + 27r^4),\\
F_{br} &= Q(r)(1 - r)(1 + 6r^2 + 4r^3 + 9r^4),\\
F_{rr} &= Q(r)(1 - r)^2(3 + 4r + 6r^2 + 4r^3 + 3r^4),
\end{aligned}$$

where $1/Q(r) = 2(1 - r)^4(1 + 2r + 4r^2 + 2r^3 + r^4)$.
In all these functions r is equal to the root z in section 2. 
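The functions $F$ can also be obtained numerically for any number of levels. The sketch below assumes, without quoting Stevens, that they are the elements of the inverse of the Gram matrix of the three columns $1$, $r^x$ and $x r^{x-1}$ for equally spaced levels $x = 0, 1, \ldots, n - 1$. Under that assumption the result agrees with the three-level closed forms given above. All names are illustrative.

```python
def stevens_f(r, n):
    """F_aa, F_ab, F_ar, F_bb, F_br, F_rr for levels x = 0..n-1.

    Assumption: entries of the inverse of the Gram matrix of the
    columns 1, r**x and x*r**(x-1).
    """
    cols = [
        [1.0] * n,
        [r ** x for x in range(n)],
        [x * r ** (x - 1) if x > 0 else 0.0 for x in range(n)],
    ]
    g = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]

    def cofactor(i, j):
        rows = [a for a in range(3) if a != i]
        cs = [b for b in range(3) if b != j]
        minor = (g[rows[0]][cs[0]] * g[rows[1]][cs[1]]
                 - g[rows[0]][cs[1]] * g[rows[1]][cs[0]])
        return (-1) ** (i + j) * minor

    det = sum(g[0][j] * cofactor(0, j) for j in range(3))
    inv = [[cofactor(j, i) / det for j in range(3)] for i in range(3)]
    names = ["a", "b", "r"]
    return {names[i] + names[j]: inv[i][j]
            for i in range(3) for j in range(i, 3)}


def f_three_levels(r):
    """Closed forms for three levels, as listed in the text."""
    d = 1.0 - r
    return {
        "aa": (1 + 4 * r**2 + r**4) / d**4,
        "ab": -(1 + 3 * r**2 + 2 * r**3) / d**4,
        "ar": -(1 + r) * (1 + r + r**2) / d**3,
        "bb": (2 - 4 * r + 8 * r**2) / d**4,
        "br": (1 + r + 4 * r**2) / d**3,
        "rr": 2 * (1 + r + r**2) / d**2,
    }
```

For five levels and $r = 0.5374$ the same routine gives $F_{aa}$ close to the value 3.1715 that the example of section 8 obtains by interpolation in Stevens's tables.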
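To avoid confusion between the estimate $b$ of $\beta$ and the parameter
$b$ which appears in the original Mitscherlich equation, we shall denote,
in this section, from now on, the estimate of $\beta$ by $\beta$.

Estimates for the variances of $\hat A$, $\hat b$ and $\hat c$ can be obtained also.
In fact, since $\hat A = a$, we have $V(\hat A) = V(a)$. Also, since $r = 10^{-\hat c q}$, we
obtain

$$dr = -10^{-\hat c q}(1/\log e)\,q\,d\hat c = -r\,(1/\log e)\,q\,d\hat c,$$

and, as $(1/\log e) = 2.30$, we have

$$V(\hat c) = \frac{V(r)}{(2.3\,rq)^2} = \frac{F_{rr}s^2}{(2.3\,rq\beta)^2}.$$

Now

$$-\beta/a = 10^{-\hat c \hat b} = r^{\hat b/q},$$

from which we obtain

$$\hat b = (q/\log r) \log\,(-\beta/a),$$

hence

$$d\hat b = (q/\log r)\left[-(1/a)\,da + (1/\beta)\,d\beta + u\,dr\right].$$

Therefore

$$V(\hat b) = (q/\log r)^2 \{(1/a)^2 V(a) + (1/\beta)^2 V(\beta) + u^2 V(r) - (2/a\beta)\,\mathrm{Cov}(a, \beta) - (2u/a)\,\mathrm{Cov}(a, r) + (2u/\beta)\,\mathrm{Cov}(\beta, r)\}$$

or

$$V(\hat b) = (q/\log r)^2 s^2 \{(1/a)^2 F_{aa} + (1/\beta)^2 F_{bb} + (u/\beta)^2 F_{rr} - (2/a\beta)F_{ab} - (2u/a\beta)F_{ar} + (2u/\beta^2)F_{br}\},$$

where $u = \log\,(a/{-\beta})/r \log r$.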
8. Example of Analysis

In an experiment on Irish potatoes carried out at the Tidewater
station by Dr. W. L. Nelson, of the Agronomy Department of the
North Carolina State College, in 1945, triple superphosphate was
applied at the levels of 0, 40, 80, 120 and 160 pounds of P₂O₅ per acre.
Four replications, in randomized blocks, were used. The plot size
was 1/65 of an acre, and gypsum was used to compensate for differences
in calcium due to unequal amounts of superphosphate. Uniform
application of 100 lb. nitrogen and 160 lb. potash was carried out. In
1946 the same amounts of superphosphate were applied again to the
same plots, and there was a further addition of 120 lb. nitrogen and
200 lb. of potash, uniformly. The yields in the two crops are given
below, in pounds per plot.


                          P₂O₅ (lb./acre)
Year              0        40        80       120       160
               199.2     292.3     387.2     491.4     387.2
1945           210.3     387.2     418.8     504.5     629.2
               253.1     396.6     508.3     446.8     523.1
               268.0     400.2     508.2     523.1     506.5
Subtotals      930.6    1476.3    1822.5    1965.8    2046.0
               103.0     193.5     215.5     205.5     217.0
1946           110.5     188.5     227.5     234.5     243.0
               103.5     194.0     201.0     206.5     227.5
               102.0     178.5     203.0     224.0     237.0
Subtotals      419.0     754.5     847.0     870.5     924.5


If we assume that equation (2.6) is multiplied by the number of
replications, which is 4, we obtain for 1945 the equation

$$930.6\,J_{51}(z) + 1476.3\,J_{52}(z) + 1822.5\,J_{53}(z) + 1965.8\,J_{54}(z) + 2046.0\,J_{55}(z) = 0.$$

The equation corresponding to (2.8), which is easier to solve, is

$$R(z) = 545.7\,J_{52}(z) + 891.9\,J_{53}(z) + 1035.2\,J_{54}(z) + 1115.4\,J_{55}(z) = 0.$$

The tables give us, for $z = 0$,

$$J_{52}(0) = -3, \qquad J_{53}(0) = J_{54}(0) = J_{55}(0) = 1.$$

Therefore we obtain:

$$R(0) = -(545.7)3 + 891.9 + 1035.2 + 1115.4 = 1405.4.$$
For $z = 1$ we find in the same way: $R(1) = -28347.5$. Since
$R(0)$ and $R(1)$ have opposite signs, we know that a root does exist
between zero and one. Let us try, for example, $z = 0.50$, and let us
take the values from the tables with just one decimal place to begin
with. We find:

$$R(0.50) = (545.7)(-9.9) + (891.9)(-6.0) + (1035.2)(1.7) + (1115.4)(8.5) = 486.91.$$

Hence the root lies between 0.50 and 1.00. A few more trials show
that $R(0.53) = 105.26$, $R(0.54) = -37.32$, all three decimal places
in the tables being now used. The usual linear interpolation gives
then $z = 0.5374$ as an estimate of the root, with an error surely smaller
than 0.01, probably much smaller than this value. Now we find by
(2.7)

$$\hat c = -\frac{\log 0.5374}{40} = 0.006743,$$

since $q = 40$ lb. P₂O₅ per acre, in this case.
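The search for the root can be sketched in a few lines. The closed forms used for $J_{52}, \ldots, J_{55}$ are not quoted from the text: they are polynomials chosen here because they reproduce the tabulated values of section 9, so they should be read as an assumption. Bisection replaces the trial-and-interpolation procedure described above.

```python
import math


def j52(z):
    return -3 - 7*z - 9*z**2 - 8*z**3 - 3*z**4 + 4*z**6 + z**7


def j53(z):
    return 1 - 4*z - 9*z**2 - 13*z**3 - 13*z**4 - 9*z**5 - 4*z**6 + z**7


def j54(z):
    return 1 + 4*z - 3*z**3 - 8*z**4 - 9*z**5 - 7*z**6 - 3*z**7


def j55(z):
    return 1 + 4*z + 12*z**2 + 12*z**3 + 12*z**4 + 6*z**5 + 3*z**6


def R(z):
    """Left-hand side of the 1945 equation corresponding to (2.8)."""
    return (545.7 * j52(z) + 891.9 * j53(z)
            + 1035.2 * j54(z) + 1115.4 * j55(z))


def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; requires a sign change on [lo, hi]."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


root = bisect(R, 0.53, 0.54)       # bracket found by the trials in the text
c_hat = -math.log10(root) / 40.0   # (2.7) with q = 40 lb. per acre
```

The bracket 0.53 to 0.54 is the one located by the trial values in the text; a wider starting interval would also work as long as the sign of $R$ changes across it.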

The computation of $\hat A$ and $\hat b$ is now carried out by formulas (2.12)
or (2.13), and (2.14), and the equation obtained is

$$y = 539.1[1 - 10^{-0.006743(x+36.15)}]$$

or

$$y = 539.1 - 307.5(0.5374)^{x'},$$

where $x' = x/40$. From either of these equations the expected values
$\hat y_i$ ($i = 1, 2, \ldots, 5$) can be calculated.


Treatments (lb. P₂O₅ per acre)     |    0    |   40    |   80    |   120   |   160
Expected mean yields (lb./plot)    | 231.56  | 373.84  | 450.29  | 491.38  | 513.44
Observed mean yields (lb./plot)    | 232.65  | 369.08  | 455.63  | 491.45  | 511.50

The sum of squares corresponding to deviations from the regression
curve can now be computed with the aid of the expected and observed
mean yields, as follows:

$$4[(231.56 - 232.65)^2 + (373.84 - 369.08)^2 + \cdots + (513.44 - 511.50)^2] = 224.52.$$
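The arithmetic of this sum of squares is easy to reproduce. The sketch below uses only the expected and observed means tabulated above; the variable names are illustrative.

```python
expected = [231.56, 373.84, 450.29, 491.38, 513.44]
observed = [232.65, 369.08, 455.63, 491.45, 511.50]
replicates = 4

# Deviations are taken on treatment means, so each squared deviation
# is multiplied by the number of replicates.
ss_dev = replicates * sum((e - o) ** 2 for e, o in zip(expected, observed))

# Five levels minus three fitted parameters leave 2 degrees of freedom.
ms_dev = ss_dev / 2

print(round(ss_dev, 2), round(ms_dev, 1))
```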


The analysis of variance can now be completed with the results 
shown below. 


Source of Variation               Degrees of      Sum of          Mean
                                   freedom        squares        square
Blocks                                3          25,130.67       8,376.9
Regression by Mitscherlich law        2         208,274.09     104,137.0***
Deviations from regression            2             224.52         112.3
(Treatments)                         (4)       (208,498.61)
Residual                             12          30,310.73       2,525.9
Total                                19         263,940.01


In the table above the three asterisks denote significance at the 
0.1% level of probability. 


Now, by linear interpolation in Stevens's tables, we obtain

$$F_{aa}(0.5374) = 3.1715,$$

so that

$$V(\hat A) = (3.1715)(2525.9)/4 = 2002.72,$$

hence $s(\hat A) = 44.75$. We divide above by 4 because the curve was
fitted to the means of 4 replicates.
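As a quick check of the figures just quoted, the variance and standard error of $\hat A$ follow directly from the interpolated $F_{aa}$ and the residual mean square. This is a sketch; the names are illustrative.

```python
import math

f_aa = 3.1715          # interpolated from Stevens's tables at r = 0.5374
residual_ms = 2525.9   # residual mean square from the analysis of variance
replicates = 4         # the curve was fitted to means of 4 replicates

var_A = f_aa * residual_ms / replicates
se_A = math.sqrt(var_A)

print(round(var_A, 2), round(se_A, 2))
```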
Similarly we obtain

$$s(\hat c) = 0.00259, \qquad s(\hat b) = 30.94.$$

For the 1946 data the same method of fitting led to the equation

$$y = 227.4[1 - 10^{-0.01186(x+22.72)}]$$

or

$$y = 227.4 - 122.3(0.3355)^{x'},$$

where $x' = x/40$. The new analysis of variance is given below.


Source of Variation               Degrees of      Sum of          Mean
                                   freedom        squares        square
Blocks                                3             686.54         228.8
Regression by Mitscherlich law        2          40,573.63      20,286.8***
Deviations from regression            2             202.05         101.0
(Treatments)                         (4)        (40,775.68)
Residual                             12             952.52         79.38
Total                                19          42,414.74

We have:

$$\hat c = 0.006743, \qquad \hat b_1 = 36.15,$$
$$\hat c' = 0.01186, \qquad \hat b' = 22.72,$$

so that

$$\hat h = \frac{0.01186}{0.006743} - 1 = 0.76.$$


To compute x* we must first estimate the average maximum yield 
A. The yield in 1946 was much lower than in 1945, due to unfavorable 
meteorological conditions. The mean of the two maximum yields 
possibly is not, therefore, a good estimate of A. However, in the 
absence of better estimates, we may use it. So 


$$\bar A = \frac{539.1 + 227.4}{2} = 383.2 \approx 380.$$


We find now, by (5.4), with f = 1.5, 


wesrere /0.38)| ; ve De 36.15 ib. BiOwyacee: 


where $w$ is the selling price of one pound of potatoes and $t$ is the cost
of one pound of P₂O₅ in the form of triple superphosphate.
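To see what (5.4) implies for this experiment, the sketch below evaluates it with the estimates obtained above ($\bar A = 380$, $\hat c = 0.006743$, $\hat b_1 = 36.15$, $f = 1.5$). The price ratio $w/t = 0.5$ is purely hypothetical, since no prices are given in the text, and the result is our own evaluation of the formula rather than a figure taken from the paper.

```python
import math


def x_star(A, c, b, h, f, price_ratio):
    """(5.4) written with price_ratio = w/t; log is the common logarithm."""
    k = 1.0 - h * (1.0 - h) / f
    arg = A * c * price_ratio / ((f - h) * math.log10(math.e))
    return (math.log10(arg) - c * b) / (c * k)


h_hat = 0.01186 / 0.006743 - 1.0          # carry-over estimate, about 0.76
level = x_star(A=380.0, c=0.006743, b=36.15,
               h=h_hat, f=1.5, price_ratio=0.5)  # hypothetical w/t
later = level * (1.0 - h_hat)             # application for subsequent crops

print(round(level, 1), round(later, 1))
```

With this assumed price ratio the first-year level comes out at roughly 60 lb. P₂O₅ per acre; a different ratio shifts the result through the logarithm only.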
9. Tables of Polynomials


  z  |  J41(z)  J42(z)  J43(z)  J44(z) |  J51(z)  J52(z)  J53(z)  J54(z)  J55(z)
0.00 |   0.000  -2.000   1.000   1.000 |   0.000  -3.000   1.000   1.000   1.000
0.01 |   0.020  -2.040   0.980   1.040 |   0.031  -3.071   0.959   1.040   1.041
0.02 |   0.041  -2.081   0.959   1.081 |   0.062  -3.144   0.916   1.080   1.085
0.03 |   0.063  -2.123   0.937   1.123 |   0.096  -3.218   0.872   1.120   1.131
0.04 |   0.085  -2.165   0.915   1.165 |   0.130  -3.295   0.825   1.160   1.180
0.05 |   0.108  -2.208   0.892   1.208 |   0.166  -3.374   0.776   1.200   1.232
0.06 |   0.132  -2.251   0.868   1.251 |   0.204  -3.454   0.725   1.239   1.286
0.07 |   0.156  -2.295   0.844   1.295 |   0.244  -3.537   0.671   1.279   1.343
0.08 |   0.181  -2.340   0.819   1.340 |   0.285  -3.622   0.615   1.318   1.403
0.09 |   0.207  -2.386   0.793   1.386 |   0.328  -3.709   0.557   1.357   1.467
0.10 |   0.234  -2.432   0.766   1.432 |   0.373  -3.798   0.496   1.396   1.533
0.11 |   0.262  -2.479   0.738   1.479 |   0.421  -3.890   0.432   1.435   1.603
0.12 |   0.290  -2.526   0.709   1.527 |   0.470  -3.984   0.365   1.473   1.676
0.13 |   0.320  -2.575   0.680   1.575 |   0.522  -4.081   0.295   1.511   1.753
0.14 |   0.350  -2.624   0.649   1.624 |   0.576  -4.179   0.222   1.548   1.833
0.15 |   0.382  -2.674   0.618   1.674 |   0.633  -4.281   0.146   1.585   1.917
0.16 |   0.414  -2.724   0.585   1.725 |   0.692  -4.385   0.067   1.621   2.005
0.17 |   0.447  -2.776   0.552   1.777 |   0.754  -4.492  -0.016   1.657   2.097
0.18 |   0.482  -2.828   0.517   1.829 |   0.819  -4.601  -0.103   1.692   2.193
0.19 |   0.517  -2.881   0.482   1.882 |   0.888  -4.713  -0.193   1.726   2.293
0.20 |   0.554  -2.934   0.445   1.936 |   0.959  -4.829  -0.288   1.760   2.397
0.21 |   0.591  -2.989   0.407   1.991 |   1.034  -4.946  -0.387   1.792   2.506
0.22 |   0.630  -3.044   0.368   2.046 |   1.113  -5.067  -0.490   1.824   2.620
0.23 |   0.670  -3.100   0.327   2.103 |   1.195  -5.191  -0.597   1.854   2.739
0.24 |   0.711  -3.157   0.285   2.160 |   1.282  -5.318  -0.709   1.883   2.862
0.25 |   0.754  -3.215   0.242   2.219 |   1.372  -5.448  -0.826   1.911   2.991
0.26 |   0.798  -3.273   0.198   2.278 |   1.467  -5.581  -0.948   1.938   3.125
0.27 |   0.843  -3.333   0.152   2.338 |   1.566  -5.718  -1.075   1.962   3.265
0.28 |   0.889  -3.393   0.105   2.399 |   1.670  -5.858  -1.208   1.986   3.410
0.29 |   0.937  -3.454   0.056   2.461 |   1.779  -6.001  -1.347   2.007   3.561
0.30 |   0.986  -3.516   0.006   2.524 |   1.893  -6.147  -1.491   2.027   3.718
0.31 |   1.037  -3.579  -0.046   2.588 |   2.013  -6.297  -1.641   2.044   3.881
0.32 |   1.089  -3.642  -0.099   2.653 |   2.138  -6.451  -1.798   2.059   4.051
0.33 |   1.142  -3.707  -0.154   2.719 |   2.270  -6.608  -1.961   2.072   4.228
0.34 |   1.197  -3.772  -0.211   2.785 |   2.407  -6.768  -2.132   2.082   4.411
0.35 |   1.254  -3.838  -0.269   2.853 |   2.551  -6.933  -2.309   2.089   4.602
0.36 |   1.312  -3.905  -0.329   2.922 |   2.701  -7.101  -2.494   2.094   4.800
0.37 |   1.372  -3.973  -0.391   2.992 |   2.859  -7.272  -2.686   2.095   5.005
0.38 |   1.434  -4.042  -0.454   3.063 |   3.023  -7.448  -2.886   2.093   5.218
0.39 |   1.497  -4.112  -0.520   3.135 |   3.196  -7.627  -3.095   2.087   5.439
0.40 |   1.562  -4.182  -0.587   3.208 |   3.376  -7.811  -3.312   2.077   5.669
0.41 |   1.628  -4.254  -0.656   3.282 |   3.565  -7.998  -3.538   2.064   5.907
0.42 |   1.697  -4.326  -0.728   3.357 |   3.762  -8.189  -3.773   2.046   6.154
0.43 |   1.767  -4.400  -0.801   3.434 |   3.968  -8.385  -4.017   2.023   6.410
0.44 |   1.839  -4.474  -0.876   3.511 |   4.184  -8.584  -4.271   1.996   6.676
0.45 |   1.913  -4.549  -0.954   3.590 |   4.409  -8.788  -4.536   1.963   6.951
0.46 |   1.989  -4.625  -1.034   3.669 |   4.644  -8.995  -4.811   1.925   7.237
0.47 |   2.067  -4.702  -1.116   3.750 |   4.890  -9.207  -5.097   1.881   7.532
0.48 |   2.147  -4.779  -1.200   3.832 |   5.147  -9.423  -5.394   1.831   7.838
0.49 |   2.229  -4.858  -1.286   3.916 |   5.415  -9.643  -5.703   1.774   8.156
0.50 |   2.313  -4.938  -1.375   4.000 |   5.695  -9.867  -6.023   1.711   8.484
0.51 |   2.399  -5.018  -1.466   4.086 |   5.988 -10.096  -6.357   1.640   8.825
0.52 |   2.487  -5.099  -1.560   4.172 |   6.293 -10.328  -6.703   1.562   9.177
0.53 |   2.577  -5.182  -1.656   4.260 |   6.611 -10.565  -7.063   1.475   9.542
0.54 |   2.670  -5.265  -1.755   4.350 |   6.943 -10.807  -7.436   1.380   9.919
0.55 |   2.765  -5.349  -1.856   4.440 |   7.289 -11.052  -7.823   1.276  10.310
0.56 |   2.862  -5.434  -1.960   4.532 |   7.651 -11.302  -8.226   1.163  10.714
0.57 |   2.961  -5.520  -2.067   4.625 |   8.027 -11.556  -8.643   1.040  11.132
0.58 |   3.063  -5.606  -2.176   4.719 |   8.420 -11.814  -9.076   0.906  11.564
0.59 |   3.167  -5.694  -2.288   4.815 |   8.829 -12.076  -9.525   0.761  12.011
0.60 |   3.274  -5.782  -2.403   4.912 |   9.255 -12.342  -9.991   0.605  12.474
0.61 |   3.383  -5.872  -2.521   5.010 |   9.699 -12.613 -10.474   0.436  12.952
0.62 |   3.494  -5.962  -2.642   5.110 |  10.161 -12.887 -10.975   0.255  13.446
0.63 |   3.608  -6.053  -2.766   5.211 |  10.643 -13.166 -11.494   0.061  13.957
0.64 |   3.725  -6.145  -2.893   5.313 |  11.144 -13.448 -12.033  -0.148  14.485
0.65 |   3.845  -6.238  -3.023   5.417 |  11.666 -13.734 -12.590  -0.371  15.030
0.66 |   3.967  -6.332  -3.156   5.522 |  12.208 -14.024 -13.168  -0.610  15.593
0.67 |   4.091  -6.427  -3.293   5.628 |  12.773 -14.318 -13.766  -0.865  16.176
0.68 |   4.219  -6.522  -3.433   5.736 |  13.361 -14.616 -14.386  -1.136  16.777
0.69 |   4.349  -6.619  -3.576   5.845 |  13.972 -14.917 -15.027  -1.425  17.398
0.70 |   4.482  -6.716  -3.722   5.956 |  14.607 -15.221 -15.691  -1.733  18.039
0.71 |   4.618  -6.814  -3.872   6.068 |  15.267 -15.529 -16.379  -2.060  18.700
0.72 |   4.757  -6.913  -4.026   6.182 |  15.954 -15.840 -17.090  -2.407  19.384
0.73 |   4.899  -7.013  -4.183   6.297 |  16.667 -16.154 -17.826  -2.775  20.089
0.74 |   5.044  -7.113  -4.343   6.413 |  17.408 -16.471 -18.587  -3.166  20.816
0.75 |   5.191  -7.215  -4.508   6.531 |  18.177 -16.791 -19.374  -3.579  21.567
0.76 |   5.342  -7.317  -4.676   6.651 |  18.977 -17.114 -20.188  -4.016  22.342
0.77 |   5.496  -7.420  -4.848   6.772 |  19.806 -17.439 -21.030  -4.478  23.141
0.78 |   5.654  -7.524  -5.024   6.894 |  20.668 -17.766 -21.900  -4.967  23.965
0.79 |   5.814  -7.629  -5.203   7.018 |  21.562 -18.095 -22.800  -5.482  24.815
0.80 |   5.978  -7.734  -5.387   7.144 |  22.490 -18.427 -23.729  -6.026  25.692
0.81 |   6.145  -7.841  -5.575   7.271 |  23.452 -18.759 -24.689  -6.599  26.595
0.82 |   6.315  -7.948  -5.767   7.400 |  24.450 -19.094 -25.680  -7.204  27.527
0.83 |   6.488  -8.056  -5.963   7.530 |  25.486 -19.429 -26.704  -7.840  28.487
0.84 |   6.665  -8.164  -6.163   7.662 |  26.559 -19.765 -27.762  -8.509  29.477
0.85 |   6.846  -8.274  -6.368   7.796 |  27.672 -20.102 -28.854  -9.214  30.497
0.86 |   7.030  -8.384  -6.577   7.931 |  28.826 -20.440 -29.980  -9.954  31.548
0.87 |   7.218  -8.495  -6.790   8.068 |  30.021 -20.777 -31.143 -10.732  32.631
0.88 |   7.409  -8.606  -7.008   8.206 |  31.259 -21.114 -32.343 -11.548  33.746
0.89 |   7.604  -8.719  -7.231   8.346 |  32.542 -21.451 -33.581 -12.406  34.895
0.90 |   7.802  -8.832  -7.458   8.488 |  33.871 -21.786 -34.858 -13.305  36.078
0.91 |   8.004  -8.946  -7.690   8.631 |  35.247 -22.120 -36.175 -14.248  37.297
0.92 |   8.210  -9.060  -7.927   8.777 |  36.672 -22.453 -37.533 -15.237  38.551
0.93 |   8.420  -9.175  -8.168   8.923 |  38.146 -22.783 -38.933 -16.273  39.843
0.94 |   8.634  -9.291  -8.415   9.072 |  39.672 -23.111 -40.376 -17.357  41.172
0.95 |   8.852  -9.408  -8.666   9.222 |  41.252 -23.436 -41.863 -18.493  42.541
0.96 |   9.073  -9.525  -8.922   9.374 |  42.886 -23.758 -43.395 -19.681  43.949
0.97 |   9.299  -9.643  -9.184   9.528 |  44.576 -24.075 -44.974 -20.924  45.398
0.98 |   9.528  -9.761  -9.451   9.684 |  46.324 -24.389 -46.600 -22.223  46.889
0.99 |   9.762  -9.880  -9.723   9.841 |  48.131 -24.697 -48.275 -23.581  48.422
1.00 |  10.000 -10.000 -10.000  10.000 |  50.000 -25.000 -50.000 -25.000  50.000
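Entries of these tables can be regenerated for any $z$ from polynomial closed forms. The coefficients below are an assumption: they are not quoted from the text but were chosen because they reproduce the tabulated values (for example at $z = 0$, 0.50, 0.80 and 1.00), so they should be checked against the definitions in section 2 before being relied on.

```python
# Coefficients listed from the constant term upward (assumed closed forms).
J_COEFFS = {
    "J41": [0, 2, 3, 4, 1],
    "J42": [-2, -4, -3, -2, 1],
    "J43": [1, -2, -3, -4, -2],
    "J44": [1, 4, 3, 2],
    "J51": [0, 3, 6, 12, 12, 12, 4, 1],
    "J52": [-3, -7, -9, -8, -3, 0, 4, 1],
    "J53": [1, -4, -9, -13, -13, -9, -4, 1],
    "J54": [1, 4, 0, -3, -8, -9, -7, -3],
    "J55": [1, 4, 12, 12, 12, 6, 3],
}


def poly(coeffs, z):
    """Evaluate a polynomial by Horner's rule."""
    total = 0.0
    for a in reversed(coeffs):
        total = total * z + a
    return total


def table_row(z):
    """All nine polynomial values at z, keyed by name."""
    return {name: poly(c, z) for name, c in J_COEFFS.items()}


for z in (0.00, 0.50, 1.00):
    row = table_row(z)
    print(f"{z:.2f}", " ".join(f"{row[name]:8.3f}" for name in J_COEFFS))
```

Within each family the polynomials sum to zero for every $z$, which gives a simple check on any row of the table.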
For the subsequent crops the most profitable level of application
of superphosphate would be estimated by

$$x^*(1 - \hat h) = 0.24\,x^*.$$
THIRD INTERNATIONAL BIOMETRIC CONFERENCE


PROCEEDINGS 


Hotel Grande Bretagne, Bellagio (Como), Italy. 
Tuesday, September 1, 1953. 


The Conference was convened by the President of the Society, Pro- 
fessor G. Darmois, shortly after 9 a.m. He introduced Dr. C. Barigozzi, 
Vice-President of the Italian Region, and Professor A. Buzzati-Traverso 
of the Organizing Committee, who, in brief addresses, welcomed the
Conference to Italy and to Bellagio on behalf of the Italian Government 
and of the Organizing Committee. Their welcoming greetings were 
followed by the Presidential Address of Professor G. Darmois, which 
was then summarized in English by Dr. Cavalli-Sforza. 

The Conference continued with a symposium on the first course in 
biometry (for the scientific program see pages 525-526), and at 12:30 p.m. 
with the first general business meeting. After calling the meeting to 
order, President Darmois appointed Drs. Maria-P. Geppert (Chair), 
Leopold Martin and Harold Hotelling as a Resolutions Committee to 
report at the closing business session, and the Conference elected Drs. 
J. W. Hopkins and A. Linder auditors of the accounts of the Organizing 
Committee. The President called upon Secretary Bliss to report on 
developments in the Society in the four years since the last international 
conference. The Secretary commented on the new Directory, then in 
press (see page 535), noted that the proceedings of the International 
Biometric Symposium in Calcutta in December, 1951, would soon be 
distributed to all members, and remarked on the many meetings organ- 
ized by the Regions and the National Secretaries for their members. 
He reviewed the negotiations with the American Statistical Association 
which led to the acquisition of Biometrics as the journal of the Society, 
and its growing importance under the able editorship of Professor 
Gertrude Cox. He closed with a report on the recent meetings in Nice 
of the I.U.B.S. (see page 535). Following announcements by Dr. 
Cavalli-Sforza the meeting adjourned. A scientific session on mathe- 
matical problems in genetics completed the program for the day. 
Methodological problems in biometry, and biometry in immunology
were the subjects of the scientific programs in the morning and afternoon 
of September 2nd. In the evening five exhibits were on display, and 
there was a meeting of the Council of the Society, attended by the follow- 
ing Council members and officers: Bliss, Cavalli-Sforza, Cochran, Cox, 
Darmois, Fieller, Finney, Fisher, Geppert, Gjeddeback, Hopkins, Irwin, 
Linder, Martin, Mather, Rao, Schwartz, van der Laan, and Yates. 
Among other decisions, it approved a constitutional amendment to 
change the current designation of "Vice-President" to "Regional
President", and the appointment of National Secretaries for Brazil, Sweden,
and Switzerland. 

Following the morning session of September 3rd on biometric meth- 
ods in agriculture, motor boats took members of the Conference and 
guests for an excursion on Lake Como, ending at Villa d’Este in Cernob- 
bio for tea and returning about dusk. In the evening the Azienda 
Autonoma di Soggiorno di Bellagio entertained the Conference at a 
reception and dance at the Lido di Bellagio. 

The program on September 4th opened with papers on functional 
relations in experimentation and on biometric problems in genetics. At 
noon a business meeting on Biometrics was chaired by Editor Cox. Its 
purpose was to discuss editorial policy, answer questions concerning 
procedure and management and to obtain suggestions for the improve- 
ment of the journal. The afternoon program of contributed papers was 
followed in the evening by a banquet and dancing at the Grand Hotel 
Villa Serbelloni. 

The morning program on September 5th concerned industrial
applications of biometry. At 11:30 Vice-President Barigozzi took the chair
for the closing business session. He called first on Professor W. G.
Cochran for a report of the Committee on the Teaching of Biometry,
which follows.


Report of the Committee on the Teaching of Biometry


Two years ago, the committee distributed a questionnaire to mem- 
bers, in order to find out to what extent members are interested in 
problems of teaching and to obtain suggestions from members about the 
work of the committee. 

There was a great variety of answers to the question: “What can this 
committee do that would be useful to you?" The most common request
was for the publication of the contents of good courses in biometry, as a 
guide to teachers of the subject. Other members would like lists of 
problems which might be used as exercises for the students, and some 
members pointed out the difficulty of finding interesting examples which
the teacher can use to illustrate the application of the common statistical
techniques. 

Suggestions were also received for assistance in keeping up with 
developments in the field by the publication of bibliographic information 
or of articles summarizing recent advances; for the simplification of 
techniques so that they would be more easily understood by biologists; 
and for finding out what techniques are actually used by biologists, in 
order to make instruction more realistic. 

The first opportunity for a meeting of the committee occurred at the 
Third International Biometric Conference at Bellagio. At this meeting 
the committee discussed plans for future work and decided to pursue 
three lines of action. 

1. To assemble for distribution accounts by experienced teachers of 
the purpose, content and mode of instruction used in their courses in 
biometry, including mention of teaching devices which they regard as 
highly successful and of difficulties which they encounter. First priority 
will be given to courses in which the students come from a fairly broad 
range of subjects within biology and second priority to courses for medi- 
cal students. The committee is of the opinion that such accounts will 
be of considerable interest to persons engaged, or about to engage, in 
the teaching of biometry. 

2. To begin work on a survey of the present status of the teaching 
of biometry to biologists, in order to find out how many students in 
biology receive instruction in biometry, and to obtain some information 
about the content and level of the instruction. The committee recog- 
nises that careful planning and execution will be necessary to obtain 
sound information of this type, and that financial assistance may be 
required for the task. The committee proposes to attempt the survey 
first in Great Britain and Italy under the direction of members from 
these countries. 

3. The committee realises the need for short and inexpensive intro- 
ductory books on biometry, which will present basic ideas without undue 
elaboration of formulae. The committee will take such steps as it can 
to stimulate the writing of books of this kind and their translation into 
other languages where this would be useful. 

Members of committee: C. I. Bliss, A. Buzzati-Traverso, W. G.
Cochran (Ch.), G. Darmois, K. Mather.


Dr. Frank Yates then reported that the Committee on the
Standardisation of Symbols, of which he was chairman, believed no
recommendations should be made at this time in view of projects pending in the I.S.I.
and in other national and international organisations, and recommended
that the Committee be discharged. His recommendation was approved
by the Conference. Dr. Geppert then presented in English the report
of the Resolutions Committee, after which it was read in French by
Dr. Martin. After discussion and amendment by the Conference, the
following resolutions were adopted by acclamation.

Meeting in closing General Assembly, the participants in the Third
International Biometric Conference:

1. Thank the President of the Council of Ministers of the Italian
Republic for the goodwill he has shown toward the organization of the
Third International Biometric Conference, which has just held its
sessions at Bellagio.

2. Thank the Governments, Universities and Institutes of teaching
and research which sent representatives to the Congress.

3. Thank Professor G. di Francesco, Rector of the University of
Milan, for the moral and material support which he gave to the
realization of the Conference.

4. Thank the Istituto Sieroterapico Milanese Serafino Belfanti, as well
as all the other private Institutions and Organizations, for the
material and financial aid which they granted to the organization of the
Conference.

5. Thank the Members of the Organizing Committee and the Chairmen of
the Sections, whose work has been very fruitful.

6. Thank the Members of the Executive Committee, and in particular
Professors Barigozzi and Buzzati-Traverso, and congratulate them on the
impeccable organization both of the working conditions and of the
entertainment of the members of the Congress and their families. They
especially applaud Dr. and Mrs. Luigi Cavalli-Sforza, whose untiring
devotion was shown on every occasion.

7. The General Assembly unanimously adopts the following resolution:

In view of the ever growing importance of biometric methods, both in
the organization of pure and applied biological research and in the
analysis of the experimental results obtained, and in view of the social
and economic importance of the results of research in applied biology,
the General Assembly expresses the wish to see governments and
Universities establish on an organic basis the teaching of Biometry in
the broad sense, that is to say of the statistical and mathematical
aspects of pure and applied biology.

The Assembly requests the International Union of Biological Sciences
to transmit this wish to U.N.E.S.C.O.

In view of the great scientific and cultural importance of this
resolution, the Assembly urgently asks U.N.E.S.C.O. to take this wish
into consideration and to communicate it to the various governments.

8. The General Assembly recommends an active liaison and closer
cooperation between the Biometric Society, including its regional and
national organizations, and the great international organizations,
such as the World Health Organization (W.H.O.), the F.A.O., the
I.S.O. (International Organization for Standardization), as well as the
International Statistical Institute (I.S.I.). The Assembly insists on
the necessity of applying biometric principles to the solution of
problems encountered on both the national and the international scale.

9. Finally, recommends that in each country there be better
coordination of efforts toward the diffusion of the knowledge and
application of biometric methods in wider circles.

With a view to giving concrete form to these projects, the General
Assembly proposes the institution in the near future, in continental
Europe, of short vacation sessions devoted to the review of current
biometric problems.

The auditors, Drs. Hopkins and Linder, reported that an examination 
of the finances of the Organizing Committee showed all funds to be 
accounted for and in good order, a report which the Conference adopted 
unanimously. The Secretary announced that the Council had given its 
tentative approval to holding the Fourth International Biometric Con- 
ference in Canada in 1958, and an International Symposium in Brazil 
in 1955. There being no new business, the Conference closed with a 
brief address by Dr. Barigozzi. 
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PRESIDENTIAL ADDRESS 


Dignités nouvelles de la Statistique dans la Recherche. 


PAR GEORGES DARMOIS 
Université de Paris 


I would first like to thank all the devoted organizers of this
Conference. Everyone knows how many varied labors, how many solutions to
recurring and unforeseen difficulties, go into a task of this scale.

It is something like a wedding, where one must think of everything.

First of all, it is our Secretary General, Dr. Cavalli-Sforza, who has
borne the burden of this preparation with a smile. Then come all the
chairmen of the various sections, all the scholars who have prepared
communications, and those who have given thought to the discussions and
interventions.

I therefore thank all of you who have been kind enough to come, some from
quite far away, like our friends from the United States and India, to
bring us the results of your work and of your experience.

In this admirable setting, so happily provided by nature and chosen with
such taste for our conference, you are going to show us all the vitality,
and all the young and active strength, of the Biometric Society.

It is precisely of these young forces that I would like to say a few
words.

In my title I spoke of the new dignities of Statistics in research. Of
course, these dignities are not new to you who exercise them, but I
thought it useful to take stock of the situation, which is moreover
changing rather quickly.

What I meant is that the statistician now intervenes much earlier, and
that instead of being brought in late for the exploitation of the results,
he has a role that begins before the observations. Not so long ago he was
called in for the obscure cases, where reasoning did not seem to succeed.
Statistics, Adolphe Thiers once said, is the art of making precise the
things one does not know. Thus the statistician was presented with a
shapeless heap of observations, made without any very definite aim, and
was told: "Bring us statistical conclusions."

On this point there has been a veritable revolution, and one that is only
beginning. The technician who used to be called in at the end now does
"Design of Experiment". One can certainly attribute to Sir Ronald Fisher a
very important role in the impetus given to this idea, that it is good,
that it is necessary, to organize the observations in advance in order to
draw the best information from them.

In general form I would say: "We have to seek the best path for
progressing toward knowledge."

Of course we must always remain modest and sensible. Our services are
called upon, but let us not be too authoritarian and peremptory in the
advice we give after drawing it from theory.

The shortest path over rough ground is not always the best. It may even
be unusable, and too much refinement in the pursuit of perfection must
not keep us from action.

As an important example, one can evidently regard as an extension of the
ideas of the planning of experiments such successes as the sequential
methods, in which the progress toward knowledge is made in a number of
steps not fixed in advance, and stops when the degree of knowledge one
had set oneself is reached.

And this leads me to bring explicitly into this optimal progression a
notion recently introduced and much studied by the specialists in
communications, the notion of information.

Taken in its general sense, information that improves is a favorable
modification of our knowledge of a question. It is useful information,
when one has just bought a gun, to have the results of a firing, whether
the dispersion of one or several cartridges, or the observation of 10
shots at a target.

A mean, or a "range", is information.

The theory of which we are speaking has aimed at defining a quantity
capable of rating, at each instant of a process, the knowledge attained,
or the distance that remains to be covered.

Fisher defined, in the theory of the estimation of a parameter, a certain
quantity which he called information. Here we are dealing with a
probability law of known form, depending on an unknown parameter. This
information is in truth a capacity of the law to inform us about the
parameter. It gives rise to remarkable theorems.

Under the same name of information, Hartley, then Shannon and Wiener,
associated with the probability law of a random element a quantity
intended to rate the statistical knowledge of that random element given
by the law considered. It is in fact an entropy. It had been thought that
these two kinds of information were closely linked. They are not; they
merely belong to a certain family of quantities having an analogous
formal structure.

This result was established by Schützenberger, who, by considerations
which we shall not reproduce, defined this family of quantities as the
mean value of a linear operation performed on the logarithm of the
probability.

For Shannon and Wiener it is the logarithm itself, and for Fisher the
linear operation is the second derivative.
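A small numerical sketch may help separate the two notions of information distinguished here. It assumes a single Bernoulli observation with success probability p, which is an illustrative choice and not taken from the address; the function names are hypothetical, and the usual sign convention (the negative of the mean) is assumed for both quantities.

```python
from math import log


def shannon_entropy(p: float) -> float:
    """Mean of -log P(x) over the two outcomes of a Bernoulli(p) variable."""
    return -(p * log(p) + (1.0 - p) * log(1.0 - p))


def fisher_information(p: float) -> float:
    """Mean of -(d^2/dp^2) log P(x; p) for one Bernoulli(p) observation.

    log P(1; p) = log p       has second derivative -1 / p**2
    log P(0; p) = log(1 - p)  has second derivative -1 / (1 - p)**2
    """
    return p * (1.0 / p**2) + (1.0 - p) * (1.0 / (1.0 - p) ** 2)


# Both are mean values of a linear operation on log P, yet they behave
# very differently as p moves away from one half.
for p in (0.5, 0.1, 0.01):
    print(p, shannon_entropy(p), fisher_information(p))
```

As p moves toward 0 the entropy falls while the Fisher quantity grows, which is one way to see that the two are related in form but not interchangeable.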

Others may exist besides, and Wald's sequential method for separating
the values of two parameters uses an information of this kind.

Schützenberger, in a recent thesis, has studied these questions and
applied them to the method of group testing, which arises, as Dorfman was
the first to point out, in the diagnosis of conditions of low frequency.
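To make the group-testing idea concrete, here is a minimal sketch of the simple two-stage scheme usually attributed to Dorfman: pool k samples, test the pool once, and retest each member only if the pool is positive. It is not Schützenberger's information-based procedure, which the address does not spell out; the function names and the prevalence value are assumptions for illustration.

```python
def expected_tests_per_person(p: float, k: int) -> float:
    """Expected number of tests per individual under two-stage pooling.

    p is the frequency of the condition, k the group size. One pooled test
    is shared by k people; if the pool is positive, each of the k members
    is then tested once.
    """
    if k < 2:
        raise ValueError("group size must be at least 2")
    prob_pool_positive = 1.0 - (1.0 - p) ** k
    return 1.0 / k + prob_pool_positive


def best_group_size(p: float, max_k: int = 100) -> int:
    """Group size in 2..max_k that minimizes the expected tests per person."""
    return min(range(2, max_k + 1),
               key=lambda k: expected_tests_per_person(p, k))


p = 0.01  # hypothetical frequency of the condition
k = best_group_size(p)
print(k, expected_tests_per_person(p, k))
```

The saving is large only when the frequency is low, which matches the setting named in the text.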

The use of information as a measure of the path traveled toward the
knowledge sought permits the use of good, short methods for reaching the
result.

Let us note only that diagnosis uses the Shannon-Wiener information, that
the estimation of the frequency makes use of Fisher's information, and
that the problem of extracting or sorting the individuals of a category
employs Wald's information.

I do not wish to dwell on the details, despite their importance.

I wished only to point out that, alongside the already numerous
applications of the planning of experiments, new results and new
perspectives are opening up through the use of information, hitherto used
almost exclusively in problems of communication.

Nothing is more natural, after all, since, as we have said, it is a
matter of progressing by experiment toward the knowledge of a solution.
[...] of which I have spoken were obtained in [...] the method of group
testing applies [...] no doubt to industrial research, [...] Biometry
offers a particularly vast field for [...] optimal progression.
SCIENTIFIC PROGRAM


(A book of abstracts of the papers presented at the Con- 
ference was distributed to each participant. It is planned 
to reprint this book and send copies to all members of the 
Biometric Society. Each meeting was organized by the 
member of the Organizing Committee who served as its 
chairman.) 


September 1st. 9:30 a.m. Presidential address by G. Darmois. (see 
page 522). 

10 a.m. The First Course in Biometry—a Symposium. Chairman:
W. G. Cochran. L. Martin—Enseignement des principes d’expérimentation
et des méthodes statistiques à des biologistes dans deux établissements
belges d’enseignement supérieur. G. Barbensi—L’insegnamento
della biometria. C. I. Bliss—A course in biometry for graduate students
in biology. A. Vessereau—Enseignement des méthodes statistiques
appliquées à la biométrie.

3 p.m. Mathematical Problems in Genetics—I. Chairman: A. 
Buzzati-Traverso. Sir Ronald Fisher—The variability in the length of 
germ plasm still heterogeneous after a given amount of inbreeding. 
K. Mather—The methodology of biometrical genetics. D. Lowry— 
Variance components with reference to genetic population parameters. 
J. L. Lush—Estimating heritabilities. 

September 2nd. 9 a.m. Methodological Problems in Biometry. 
Chairman: Gertrude Cox. J. W. Hopkins—Some needed significance 
tests. F. Anscombe—Fixed-sample-size analysis of sequential observa- 
tions. W. G. Cochran—The combination of estimates from different 
experiments. M. Keuls—Testing differences between means in an 
analysis of variance. M. J. R. Healy—Decision between two alternatives;
how many experiments? C. R. Rao—A general theory of discrimination
when information about alternative population distributions is
based on samples. By title L. Martin—Suggestions for longitudinal 
data in gerontology. 

3 p.m. Biometry in Immunology. Chairman: G. Rasch for H. C. 
Batson. R. Prigge—Die Anwendung der Mutungsbereiche in der
Immunitätsforschung. J. Ipsen—Factors of dosage and host determining
antibody response to secondary antigen stimulus. L. B. Holt—Quantitative
studies in diphtheria prophylaxis: an attempt to derive a mathe-
matical characterization of the antigenicity of diphtheria prophylactics. 
S. Peto—A dose response equation for the invasion of microorganisms. 

9:30 p.m. Exhibits. E. Morice—A statistical study of the phe- 
nomena of human growth. E. Olbrich—Nomographic tables with whole 
numbers and the estimation of errors in anthropological studies. S. C.
Pearce and G. H. Freeman—Repeated changing of treatments in trials 
with long-lived species. J. M. Tanner—Size, shape and regional differ- 
ences in the vertebral columns of inbred strains of rabbits, studied by 
covariance. J. Dufrenoy and F. M. Goyan (presented by D. Schwartz)— 
A graphical calculator for statistical analysis. 

September 3rd. 9 a.m. Biometric Methods in Agriculture. Chair- 
man: P. V. Sukhatme. F. Yates—The place of simple experiments on 
cultivators’ fields in agricultural development. V.G. Panse—Principles 
of the survey method of experimentation. T. N. Hoblyn and S. C.
Pearce—Some considerations in the design of successive experiments on 
fruit plantations. G. Rasch—On different sources of errors and the 
advantage of their knowledge in planning experiments. 

September 4th. 9 a.m. Functional Relations in Experimentation. 
Chairman: H. Wold. D. J. Finney—Functional relationships in experi- 
mentation. J. Berkson—Minimum chi square and maximum likelihood 
estimates of regression coefficients. 

11 a.m. Mathematical Problems in Genetics—II. Chairman: F. 
Yates. A. R. G. Owen—Experimental designs in genetics. C. A. B. 
Smith—The calculation of correlation between cousins. 

3 p.m. Contributed Papers. Chairman: M. J. R. Healy. A. F.
Parker-Rhodes—Estimating populations of irregularly observable
organisms. D. W. Goodall—Factor analysis in plant sociology.
G. Karreman—The mathematical biology of threshold and related
phenomena in excitation. M. Fréchet—Réhabilitation de la notion
statistique de l’homme moyen. G. Teissier—Sur la détermination de
l’axe d’un nuage rectiligne de points (read by G. Darmois).
M. W. Bentzon—On the statistical evaluation of dose response curves
in case the dose intervals are large.

September 5th. 9 a.m. Industrial Applications of Biometry. Chairman:
A. Linder. E. A. G. Knowles—Applications of experimental designs in
industry. D. R. Read—The design of chemical experiments.
H. C. Hamaker—Experimental designs in industry: a discussion.


12 noon. Closing remarks by C. Barigozzi, Vice President of The
Biometric Society for the Italian region.




REGISTRATION 


The 125 participants in the Conference represented 24 different 
countries, with 101 of them members of The Biometric Society when 
the Directory went to press. Twelve were delegates representing 
governments, governmental departments, international organizations, 
educational institutions or academies of science. These individuals are 
starred in the following list of participants, arranged by countries: 
Argentine—J. R. Hérler;* Austria—E. Olbrich; Belgium—R. van den 
Driessche,* A. Lenger, L. Martin,* A. H. L. Rotti; Brazil—A. Grosz- 
mann; Canada—J. W. Hopkins*; Denmark—M. W. Bentzon, N. 
Gjeddebaeck, A. Hald*, G. Rasch; France—J. Arnoux, G. Darmois, 
D. Dugué, J. M. Faverge, R. Feron, M. Frechet, M. Lamotte, M. J. 
Laurent-Duhamel, E. Morice, B. Pélegrin, D. Schwartz, J. Ulmo, 
L. A. Vessereau; Germany—F. Bernstein, F. J. Geks, M. P. Geppert, 
J. Hartung, S. Koller, R. Prigge, H. Prébstel, D. Wichmann; Gold 
Coast—D. W. Goodall; Greece—B. G. Christidis; India—P. C. Maha- 
lanobis, V. G. Panse, C. R. Rao, P. V. Sukhatme; Italy—E. Baldacci, 
G. Barbensi, C. Barigozzi, L. Boretti, A. Buzzati-Traverso, L. L. 
Cavalli-Sforza, R. Ciferri, G. De Angeli, F. Frassetto, T. Gelsomini, 
V. Nozzolini, A. Palazzi, A. Previtera, R. Scossiroli, F. Sella; Malaya— 
D. R. Westgarth; Mexico—A. M. Flores*; Netherlands—D. van 
Dantzig*, E. F. Drion, J. D. Erlee, H. C. Hamaker, G. Hamming, 
J. Hemelrijk, G. van Iterson*, M. Keuls, E. van der Laan, C. A. G. 
Nass, D. J. Stoker; Norway—N. A. Barricelli, Ø. Nissen, P. Ottestad;
Portugal—A. Tovar de Lemos; Spain—S. Rios Garcia*; Sweden— 
H. Bergstrém, N. Blomqvist, H. Wold; Switzerland—C. Blanc*, 
A. Kaelin, A. Linder*, E. M. Lourie; Turkey—O. Düzgüneş; United
Kingdom—F. J. Anscombe, M. S. Bartlett, R. E. Blackith, R. O. 
Cashen, O. L. Davies, E. C. Fieller, D. J. Finney, Sir Ronald Fisher, 
M. J. R. Healy, A. Bradford Hill, S. B. Holt, L. B. Holt, J. O. Irwin,
E. A. G. Knowles, F. B. Leech, K. Mather, J. A. Nelder, A. R. G. Owen, 
A. F. Parker-Rhodes, S. C. Pearce, S. Peto, D. R. Read, M. R. Sampford, 
C. A. B. Smith, J. M. Tanner, J. Wishart, F. Yates; United States of 
America—J. Berkson, C. A. Bicking, C. I. Bliss, W. G. Cochran, 
G. M. Cox*, B. B. Day, H. Hotelling, J. Ipsen, Jr., G. Karreman, 
H. W. Kloepfer, K. Kopf, I. M. Lerner, D. C. Lowry, E. Lukacs, J. L.
Lush, H. H. Smith; Uruguay—G. J. Fischer.


QUERIES 


GEORGE W. SNEDECOR, Editor


QUERY 104: It is required to find whether or not $y$ depends on the
variables $x_1$, $x_2$, $x_3$, $x_4$ and $x_5$. One expects to find a
relationship for $y$ depending only on those factors to each of which can
be attributed a significant proportion of the variability of the $y$'s
when those factors are considered together. For example, table (1) shows
that $y$ depends on $x_1$ and $x_2$, and also on $x_5$ when $x_3$ and
$x_4$ are taken into account. If one decides that $x_3$ and $x_4$ have no
effect on $y$ then, from table (2), $x_5$ also does not affect $y$. Table
(3) then shows that only $x_1$ accounts for a significant proportion of
the variability in the $y$'s. The final analysis is given in table (4).
This assumes that the regression on $x_2$, $x_3$, $x_4$ and $x_5$ is
zero. Omitting the "non-significant" variables will then lead to a biased
estimate of the regression on $x_1$, and the analysis of variance tables
show that the estimate of error is inflated. With a larger number of
observations the error would not be inflated to the same extent. The
initial hypothesis determines the method of analysis. In the example
quoted the procedure adopted is suspect, but would it be justified if
there were considerably more observations?


(1)
                                                     d.f.     s.s.       m.s.
Regression on $x_1$                                    1    128627***
    $x_2$ after $x_1$                                  1     28891*
    $x_3$ after $x_1$, $x_2$                           1      3935
    $x_4$ after $x_1$, $x_2$, $x_3$                    1       964
    $x_5$ after $x_1$, $x_2$, $x_3$, $x_4$             1     25784*
Remainder                                              8     35659     4457.4
Total                                                 13    223860


(2)
                                                     d.f.     s.s.       m.s.
Regression on $x_1$                                    1    128627***
    $x_2$ after $x_1$                                  1     28891*
    $x_5$ after $x_1$, $x_2$                           1     17842
    $x_3$ after $x_1$, $x_2$, $x_5$                    1       795
    $x_4$ after $x_1$, $x_2$, $x_5$, $x_3$             1     12046
Remainder                                              8     35659     4457.4
Total                                                 13    223860


(3)
                                                     d.f.     s.s.       m.s.
Regression on $x_1$                                    1    128627***
    $x_2$ after $x_1$                                  1     28891
Remainder                                             11     66342     6031.1
Total                                                 13    223860


(4)
                                                     d.f.     s.s.       m.s.
Regression on $x_1$                                    1    128627**
Remainder                                             12     95233     7936.1
Total                                                 13    223860


ANSWER: Your procedure is incorrect because only one of the five mean
squares each with one degree of freedom in Table I provides a test of
significance of a particular aspect of the null hypothesis. The model in
the present situation is

$$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + e,$$

the $e$'s having the usual Gaussian properties. Your Table I provides one
exact test of significance, namely of the null hypothesis that $\beta_5$
is zero, for which $F$ with 1 and 8 degrees of freedom is 25784/4457.4.
We conclude then that $y$ depends on $x_5$.

We can derive from your Table I other valid tests of significance.
For instance, we can obtain the following analysis of variance:


Due to                                               d.f.   S. Sqs.    M. Sq.
Regression on $x_1$                                    1    128627
Regression on $x_2$, $x_3$, $x_4$, $x_5$ after
    taking account of $x_1$                            4     59574    14,893.5
Remainder                                              8     35659     4,457.4
Total                                                 13    223860


This gives a combined test of significance of $\beta_2$, $\beta_3$,
$\beta_4$ and $\beta_5$. The $F$ ratio is 3.34, whereas the 5% point of
the $F$ distribution is 3.84. According to this test then we conclude
that $y$ does not depend on $x_2$, $x_3$, $x_4$ and $x_5$. We could also
construct e.g. an analysis for testing $x_4$ and $x_5$ jointly.
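The two F ratios quoted so far can be checked directly from the querist's table (1). The sketch below assumes that table is a sequential analysis in the order x1, x2, x3, x4, x5 (as the answer treats it); the dictionary keys and the function name are hypothetical labels.

```python
# Single-d.f. sums of squares from the querist's table (1), in fitting order.
sequential_ss = {
    "x1": 128627,
    "x2 after x1": 28891,
    "x3 after x1,x2": 3935,
    "x4 after x1,x2,x3": 964,
    "x5 after x1,x2,x3,x4": 25784,
}
remainder_ss, remainder_df = 35659, 8
error_ms = remainder_ss / remainder_df  # the 4457.4 of the tables


def f_ratio(ss: float, df: int, error_mean_square: float) -> float:
    """Mean square for a set of regression terms divided by the error m.s."""
    return (ss / df) / error_mean_square


# Exact test of beta_5 = 0 (1 and 8 d.f.).
f_beta5 = f_ratio(sequential_ss["x5 after x1,x2,x3,x4"], 1, error_ms)

# Joint test of beta_2 = ... = beta_5 = 0 (4 and 8 d.f.): pool every line
# fitted after x1.
joint_ss = sum(ss for name, ss in sequential_ss.items() if name != "x1")
f_joint = f_ratio(joint_ss, 4, error_ms)

print(round(f_beta5, 2), round(f_joint, 2))
```

The pooled sum of squares reproduces the 59574 of the answer's table, and the joint ratio reproduces the quoted 3.34.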

The analysis of variance given above cannot be used to test the
significance of the dependence of $y$ on $x_1$. To make this test we must
construct the analysis of variance:


Due to                                               d.f.   S. Sqs.          M. Sq.
Regression on $x_2$, $x_3$, $x_4$, $x_5$               4    (not given)
Regression on $x_1$ after taking account of
    $x_2$, $x_3$, $x_4$, $x_5$                         1    (by subtraction)   M
Remainder                                              8     35659           4457.4
Total                                                 13    223860


The test consists of comparing $M/4457.4$ with the $F$ distribution with
1 and 8 d.f. If the verdict is significant, then pro tempore on the basis
of the test and the previous one, we would state that $y$ depends only on
$x_1$. The possible bias in a predictor based on $x_1$ only can be
written down from the formula given below.

The reader will have already noticed one inconsistency in the conclusions
that are drawn, namely the test described first indicates that $y$
depends on $x_5$, while the second test indicates that $y$ does not
depend on $x_5$. To elucidate this matter further, it is necessary to
note that the three tests given above are really giving evidence on the
following:

first test: does use of $x_5$ add precision to our prediction of $y$
based on $x_1$, $x_2$, $x_3$ and $x_4$?

second test: does use of $x_2$, $x_3$, $x_4$ and $x_5$ add precision to
our prediction of $y$ based on $x_1$ alone?

third test: does use of $x_1$ add precision to our prediction of $y$
based on $x_2$, $x_3$, $x_4$ and $x_5$?

Tests like the first and the third can be made by the analysis of
variance as indicated above, or can be made by $t$-tests.

The $t$-test procedure was given by Fisher (J. Roy. Stat. Soc., 85, 611)
and is mentioned in Statistical Methods for Research Workers, Section
29. Separate $t$-tests (or $F$ tests) on single independent variables are
correlated because the same estimate of error is used in all. The joint
test, exemplified by the test of $x_2$, $x_3$, $x_4$ and $x_5$ above,
removes this correlation effect. This test was devised many years ago and
is the one used for example in the analysis of covariance. A description
can be found in Kempthorne’s Design and Analysis of Experiments,
pp. 44-47, for example.

The inconsistency noted above is the usual one which arises when 
several contrasts are tested jointly. One contrast may be significant 
judged alone, but its magnitude becomes diluted by other contrasts of 
small magnitude in a joint test. 

However, another type of inconsistency is of frequent occurrence, 
which may be exemplified as follows. It may happen that in the
observational set up, as specified by the array of values for $x_1$,
$x_2$, $x_3$, $x_4$ and $x_5$, the values of some of the independent
variables, say $x_4$ and $x_5$, are highly correlated. As a result
Fisher’s test will state that $x_5$ is of no value, which really means
that $x_5$ adds no precision to the prediction of $y$ given that the
predictor contains $x_1$, $x_2$, $x_3$ and $x_4$; also the test on $x_4$
will yield non-significance, meaning that $x_4$ adds no precision to the
prediction of $y$ given that the predictor contains $x_1$, $x_2$, $x_3$
and $x_5$. However, when one makes the joint test of $x_4$ and $x_5$,
which really asks whether $x_4$ and $x_5$ add precision to the prediction
of $y$, given that the predictor contains $x_1$, $x_2$ and $x_3$, we may
find that $x_4$ and $x_5$ jointly do add precision. This difficulty can
be bypassed in experimental situations by judicious selection of the
values of $x_1$, $x_2$, $x_3$, $x_4$ and $x_5$ for which $y$ is observed,
i.e. by aiming at orthogonality.

In the absence of a priori knowledge, it is therefore necessary to make 
a sequence of significance tests and the interpretation to be made of suc- 
cessive members in the sequence is not clear cut. Nor is it at all clear 
how one should overcome the difficulty of almost complete nonortho- 
gonality. Certainly it seems reasonable first to test each regression 
coefficient separately by the Fisher test. Then one might omit the inde- 
pendent variable which is least significant among the non-significant 
variables, and start again as though this omitted variable had not been 
observed. Better procedures may well be available and the
present author would be interested to hear of them. A difficult situation
is examined from a point of view much like the present one by Fisher
(Proc. Roy. Soc. B, 126, 25-29).
The procedure suggested will lead to a conclusion to which exact
probabilities cannot be attached. However, it is reasonable to take the
point of view that the conclusion reached by this method, which is
objective even though its operating characteristics are not known, should
be regarded as tentative and the validity of the omission of the eliminated
variables from the predictor should be tested on a new set of data from
the same source. Perhaps the best that can be done, given a set of data,
is to divide the data into two sets at random, using one set to estimate
the dependency relation and the other set to test the validity of this
discovered relation.
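The random division suggested here is easy to mechanize. The following is a minimal sketch only: the function name, the fixed seed and the equal split are assumptions, and the rows stand in for whatever observations (y with its x's) are at hand.

```python
import random


def split_at_random(rows, seed=0):
    """Divide the observations at random into two sets of (nearly) equal size.

    The first set is meant for estimating the dependency relation, the
    second for testing the relation so discovered.
    """
    rng = random.Random(seed)   # seeded so the division can be reproduced
    shuffled = list(rows)       # copy, so the caller's data is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical usage with 14 observation labels (the query has 13 total d.f.).
estimation_set, validation_set = split_at_random(range(14), seed=1)
print(len(estimation_set), len(validation_set))
```

With as few observations as in the query, each half leaves very few degrees of freedom for error, so the device is more attractive for larger bodies of data.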

It is perhaps worth commenting that the problem can be viewed as one of
the choice of a predictor of $y$, and then the standard facets of decision
theory (costs, risks, etc.) would have to be incorporated.

Also it is worth adding that some other procedure perhaps based on 
principal components would have to be used if we have a large number 
of independent variables. 

The situation is simpler if we have a priori knowledge. For example,
we may have good reason to believe that $y$ depends on $x_1$, $x_2$ and
$x_3$ and does not depend on $x_4$ and $x_5$. We may then use the general
test procedure described above to test the accuracy of our a priori
knowledge. The analysis for this test in the present situation is as
follows:


Due to                                               d.f.   S. Sqs.          M. Sq.
Regression on $x_1$, $x_2$, $x_3$                      3
Regression on $x_4$, $x_5$ after taking account
    of $x_1$, $x_2$, $x_3$                             2    (by subtraction)   M
Remainder                                              8     35659           4457.4
Total                                                 13    223860


$$F = \frac{M}{4457.4} \quad \text{with 2 and 8 d.f.}$$


Even if we accept the validity of our prior knowledge it is probably wise
to use as error mean square that based on deviations from the regression
of $y$ on all variables to insure that it is not biased.
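As a worked illustration only (the answer leaves M symbolic): if the querist's table (1) is read as a sequential analysis in the order x1, x2, x3, x4, x5, the sum of squares for x4 and x5 after x1, x2, x3 is the sum of its last two single-d.f. lines. The 5% point shown is the usual tabled value for 2 and 8 d.f., quoted here as an assumption rather than taken from the answer.

```latex
M = \frac{964 + 25784}{2} = 13374, \qquad
F = \frac{M}{4457.4} = \frac{13374}{4457.4} \approx 3.00
  \;<\; F_{0.05}(2,\,8) \approx 4.46 .
```

On that reading the data would not contradict the a priori belief that y does not depend on x4 and x5, although x5 taken last is significant by itself, which is the dilution effect the answer describes.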

The bias in regression coefficients estimated after the elimination of
some variables may be written down easily (see Kempthorne p. 59).
Denote the full model by

$$y = X_1 \gamma + X_2 \delta + e,$$

where $y$, $X_1$, $X_2$, $\gamma$, $\delta$, $e$ are matrices, $y$,
$\gamma$, $\delta$ and $e$ being column matrices, $\gamma$ and $\delta$
being the arrays of regression coefficients. Then the bias in $\gamma$
resulting from assuming $\delta$ to be zero is equal to

$$(X_1' X_1)^{-1} X_1' X_2 \, \delta,$$

where $X_1'$ is the transpose of $X_1$, and $\delta$ is the true value of
$\delta$, which is erroneously taken to be zero. For example if we ought
to be fitting

$$y = \beta_1 x_1 + \beta_2 x_2$$

but in fact fit

$$y = \beta_1 x_1,$$

the bias in $\beta_1$ is

$$\frac{\sum x_1 x_2}{\sum x_1^2} \, \beta_2 .$$

It is also worth noting that in writing down a model such as

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_5 x_5$$

one has already possibly introduced bias by omitting variables
necessarily because they were unobserved.
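The two-variable bias formula can be verified numerically. The sketch below uses invented, noise-free data (so that the bias is seen exactly, with no sampling error); the values of x1, x2, beta1 and beta2 and the function name are all hypothetical.

```python
def slope_through_origin(x, y):
    """Least-squares coefficient for fitting y = b * x with no intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Invented data: y is generated exactly from the two-variable model.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 1.0, 4.0, 3.0]
beta1, beta2 = 2.0, 0.5
y = [beta1 * a + beta2 * b for a, b in zip(x1, x2)]

# Fit the wrong model y = beta1 * x1, omitting x2.
fitted_beta1 = slope_through_origin(x1, y)

# Bias predicted by the formula: (sum x1*x2 / sum x1^2) * beta2.
predicted_bias = (sum(a * b for a, b in zip(x1, x2))
                  / sum(a * a for a in x1)) * beta2

print(fitted_beta1 - beta1, predicted_bias)
```

If x1 and x2 were orthogonal (sum of products zero), the predicted bias would vanish, which ties this formula to the earlier advice about aiming at orthogonality.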


O. KEMPTHORNE 


QUERY 105: I have a split-split plot experiment with 4 replications.
I have been unable to find a formula for the standard error of the
difference between means of treatments at different levels of the
splits. Can you help me?


ANSWER: Assume we have $r$ replications, $\alpha$ whole-plot treatments,
$\beta$ sub-plot treatments and $\gamma$ sub-sub-plot treatments; the
error mean squares are $E_a$ for the whole-plots, $E_b$ for the sub-plots
and $E_c$ for the sub-sub-plots. The estimated standard error of the
difference between the mean yields of treatments $(a_0 b_0 c_0)$ and
$(a_1 b_1 c_1)$ is

$$\sqrt{\frac{2}{r \beta \gamma} \left[ E_a + (\beta - 1) E_b + \beta (\gamma - 1) E_c \right]} .$$

The same result holds for any comparison with different whole-plot
treatments, e.g. $(a_0 b_0 c_0)$ vs. either $(a_1 b_1 c_0)$ or
$(a_1 b_0 c_1)$ or $(a_1 b_0 c_0)$.

However, if you want to compare $(a_0 b_0 c_0)$ with either
$(a_0 b_1 c_1)$ or $(a_0 b_1 c_0)$, the estimated standard error is

$$\sqrt{\frac{2}{r \gamma} \left[ E_b + (\gamma - 1) E_c \right]} ,$$

and if you change only the sub-sub-plot treatment, e.g. $(a_0 b_0 c_0)$
vs. $(a_0 b_0 c_1)$, the estimated standard error is

$$\sqrt{2 E_c / r} .$$
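The three standard errors can be collected into small functions. This is a sketch under the assumption that the formulas read as the usual split-split-plot results (each a square root, with the multipliers 2/(r·β·γ), 2/(r·γ) and 2/r); the function names and the numerical mean squares are hypothetical.

```python
from math import sqrt


def se_different_whole_plot(r, n_sub, n_subsub, e_a, e_b, e_c):
    """Two treatment combinations that differ in the whole-plot treatment."""
    pooled = e_a + (n_sub - 1) * e_b + n_sub * (n_subsub - 1) * e_c
    return sqrt(2.0 / (r * n_sub * n_subsub) * pooled)


def se_same_whole_plot(r, n_subsub, e_b, e_c):
    """Same whole-plot treatment, different sub-plot treatment."""
    return sqrt(2.0 / (r * n_subsub) * (e_b + (n_subsub - 1) * e_c))


def se_same_sub_plot(r, e_c):
    """Only the sub-sub-plot treatment differs."""
    return sqrt(2.0 * e_c / r)


# Hypothetical experiment: 4 replications, 3 sub-plot and 2 sub-sub-plot
# treatments, with invented error mean squares E_a, E_b, E_c.
r, n_sub, n_subsub = 4, 3, 2
e_a, e_b, e_c = 40.0, 25.0, 10.0
print(se_different_whole_plot(r, n_sub, n_subsub, e_a, e_b, e_c))
print(se_same_whole_plot(r, n_subsub, e_b, e_c))
print(se_same_sub_plot(r, e_c))
```

A quick sanity check on the reconstruction: when the three error mean squares are equal, all three expressions collapse to the same value, the square root of 2E/r, as they should when splitting gives no gain in precision.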
I would like to emphasize that these are average standard errors and
are not intended to be used to compare means selected on the basis of the
results in the given experiment. For example, you should not use these
standard errors to compare the two most divergent means in this
experiment.
R. L. ANDERSON 


CORRECTION FOR QUERY 100, VOL. 9 (1953), PAGE 253.


There was a misunderstanding between Dr. Irwin and me concerning 
the presence or absence of interaction. The method furnished by Dr. 
Irwin gives unbiased estimates, though not the best ones, if interaction 
is absent; it is the appropriate method to use if interaction is assumed 
present as in table 11.24 of my fourth edition. In fact, on the assumption 
of interaction in the population, this method is the one actually applied 
in table 11.25. 

But under the assumption of no interaction in the population, the 
correct method is the one given in table 11.22, with no alteration. This 
means that the method described in Query 100 was inadvertently applied 
to table 11.22. As the printers would say of this table, “Stet.” 

Dr. Irwin and I join in regret that our misunderstanding was not dis- 
covered before the query was printed. 


GEORGE W. SNEDECOR 


THE BIOMETRIC SOCIETY 


1953 Directory. Each member of the Society should have received 
his free copy of our new 1953 Directory before this number of Biometrics 
reaches him. It lists all members of the Society as of December 31, 1952, 
plus all members enrolled between then and July, 1953, when the Direc- 
tory went to press. Our first Directory in 1949 listed 902 members and 
the new Directory, four years later, 1152 members, representing a net 
increase of 25 per cent even though 40 per cent of those listed in 1949 
no longer belong to the Society. Geographically, our members live in 
fifty different countries, with no country having an absolute majority. 
Growth has been most notable in Belgium, Germany, Japan, and on the 
African continent. In addition to the alphabetical membership list 
and a geographical summary, the Directory includes a list of the general 
and regional officers of the Society since its founding, the Society 
Constitution and the Council By-Laws. Additional copies of the Direc- 
tory may be purchased through the Office of the Secretary-Treasurer in 
New Haven. 


I.U.B.S. The International Union of Biological Sciences held its 
Eleventh General Assembly in Nice, France on August 17-21, with 
President H. Munro Fox, Professor of Zoology in the University of 
London, presiding. The Biometric Society is one of the nine sections of 
the I.U.B.S., the other sections being Botany, Cytology, Embryology, 
Entomology, Genetics, Limnology, Microbiology and Zoology. The 
Section of Biometry was represented by C. I. Bliss, Sir Ronald Fisher, 
M. Lamotte and A. Linder. The opening address by President Fox was 
followed by reports for the period 1950-53 by the Secretary General of 
the Union, Professor P. Vayssiere, by a representative of each section, 
sub-section, commission and joint commission, and by the Treasurer of
the Union, Professor F. Chodat. These reports, including that on Biom- 
etry by the Secretary of the Society, will be published in the Proceedings 
of the Assembly. 

New regulations for the Union were adopted and activities for the 
period 1953-56 reviewed. Among the symposia accepted by the Assembly
was a proposal by the Section of Biometry for a biometric symposium
in Brazil in July, 1955. Another symposium in Brazil in 1955, on ento- 
mology, was also approved. Since the International Statistical Institute 
will hold its 29th session in Brazil in the summer of 1955, the biometric 
program will be coordinated with these allied international meetings to 
their mutual advantage. Plans were proposed for enlarging the activities 
and income of the Union, so that they would be more nearly commen- 
surate with its scope. The following general officers were elected for the 
period 1953-56: President, S. O. Horstadius, Uppsala; Vice-President, 
R. E. Cleland, Washington, D.C.; Secretary General, G. Montalenti, 
Naples; Secretary, R. Ulrich, Paris; Treasurer, A. Linder, Geneva. 
Drs. Montalenti and Linder are both members of The Biometric Society. 


I.S.I. The 28th Session of the International Statistical Institute, 
with which the Society is affiliated, convened in Rome at F.A.O. head- 
quarters on September 6-12, with many members of the Society in 
attendance. The Session will be remembered not only for its scientific 
programs, but also for the magnificence of the entertainment provided 
by the Organizing Committee, and the interest in the Institute shown by 
our hosts. During the week participants in the Session were welcomed 
by Prime Minister Pella, addressed in a special audience by Pope Pius 
XII, and received at the Quirinale Palace by President Einaudi, himself 
a member of the Institute for twenty-five years. Following the Session 
the Institute sponsored a Seminar on September 14-17 as part of its 
statistical education program. Many of the lecturers on mathematical
and industrial statistics and on sample surveys were members of the
Society.


French Region. At the meeting of the Region on May 20, 1953, in Paris,
Professor G. Malécot spoke on “Migration et évolution des populations”,
and Mr. M. Lamotte on “Essai d’emploi de quelques méthodes
mathématiques en génétique des populations (fin)”.


Japanese Section. The first meeting was held on August 8, 1953 at
the National Institute of Agricultural Sciences, Tokyo, the attendance
being about fifteen. Various business matters were discussed. It was 
decided to hold at least two general meetings a year, and special meetings 
in Tokyo about four times a year or as required. The spring general 
meeting will be held in connection with the general meeting of the 
Japanese Society of Agronomy. The autumn general meeting will be 
held in connection with that of the Japanese Society of Mathematical 
Statistics. It was proposed that a summary of the biometric studies 
conducted in Japan during the year should be presented at one of the
general meetings. 


German Section. The German members of the Society held their 
first session at the Paul Ehrlich Institute in Frankfurt am Main on Sep- 
tember 22. Dr. Maria-Pia Geppert, National Secretary for Germany, 
opened the afternoon business meeting with a report on the Bellagio 
Conference. Society activities in Germany were discussed, including 
arrangements for the publication of biometric articles in Germany, a 
listing of members, both present (now numbering twenty-five) and pro- 
spective, and plans for the next meeting of the section in January. In 
the evening members of the Society, meeting jointly with the Biologischer
Verein e.V., were addressed by Professor A. Linder of the University
of Geneva on “Experimental Design” and by Dr. C. I. Bliss of
New Haven, Conn. on “Two Examples of Covariance”, with more than
seventy-five in attendance. An active discussion continued until after 
midnight. 


Netherlands Section. Members of The Biometric Society in the Neth- 
erlands met jointly with the Medical-biological Section of the Nether- 
lands Statistical Society and the Biometric Section of the Netherlands 
Society for Agricultural Science at the University of Utrecht on Sep- 
tember 25, with thirty-six attending. In the morning C. I. Bliss 
described the role of covariance in solving a medical and an agricultural 
problem, and in the afternoon P. L. F. de Jong discussed the prediction 
of the size of the strawberry crop in the Netherlands in 1953 and 1954 
from the results of previous years, weather records and economic factors. 
The meeting closed with reports on the Third International Biometric 
Conference in Bellagio by E. van der Laan and on the 28th session of the 
International Statistical Institute in Rome by E. F. Drion. 


Belgian Region. At the invitation of Professor P. Spehl, Professor 
C. I. Bliss, Secretary of The Biometric Society, gave a well-received 
lecture before the Société Adolphe Quetelet, Association des 
Biométriciens de Belgique et du Congo Belge. The meeting was held on 
the premises of the Faculty of Medicine of the University of Brussels 
on September 29. Professor Bliss's lecture, which was followed by a 
discussion, was on the subject “The solution of a medical and of an 
agricultural experiment with covariance”. About thirty members of the 
Société Adolphe Quetelet heard the speaker. 


Cooperative Graduate Summer Sessions in Statistics 


Beginning in 1954 North Carolina State College, the University of 
Florida, Virginia Polytechnic Institute and the Southern Regional Edu- 
cation Board will jointly sponsor cooperative Graduate Summer Sessions 
in Statistics. 

The first session will be conducted by a distinguished faculty at 
Virginia Polytechnic Institute in the summer of 1954. Additional 
summer sessions are tentatively planned for North Carolina State 
College and the University of Florida in the two following years. Subse- 
quent sessions will be rotated among these or other institutions through- 
out the South. 

The summer sessions are designed to carry out a recommendation of 
the Southern Regional Education Board’s Commission on Statistics, on 
which the three institutions initiating the program are represented. 
They will be of particular interest to (1) research and professional 
workers who want intensive instruction in basic statistical concepts and 
who wish to learn modern statistical methodology; (2) teachers of 
elementary statistical courses who want some formal training in modern 
statistics; (3) prospective candidates for graduate degrees in statistics; 
(4) graduate students in other fields who desire supporting work in 
statistics; and (5) professional statisticians who wish to keep informed 
of advanced specialized theory and methods. 

Each of the summer sessions will last six weeks and each course will 
carry three semester hours of graduate credit, with a maximum of six 
semester hour credits earned in one summer. The courses are arranged 
to enable the person to take consecutive work in successive summers. 
The summer work in statistics may be applied at any one of the cooperat- 
ing institutions in partial fulfillment of the requirements for a Master’s 
degree. The catalog requirements for the degree must be met at the 
degree-granting institutions. Each doctoral candidate should consult 
with the institution from which he desires to obtain the degree regarding 
the applicability of the summer courses in statistics. 

During the first session Professor Maurice Kendall of the University 
of London will give a course in Multi-variate Analysis, and Dr. Ralph 
Comstock of North Carolina State College will give one in Quantitative 
Genetics. The staff of the Virginia Polytechnic Institute’s Department of 
Statistics will offer such courses as Probability and Inference, Analysis 
of Variance, Statistical Methods, Engineering Statistics, Education 
Statistics, Rank Order Statistics and the Theory of Sequential Methods. 

The department includes R. A. Bradley, D. B. Duncan, M. C. K. 
Tweedie, P. M. Somerville and Boyd Harshbarger. In addition, other 
outstanding statistical scholars will direct special afternoon seminars. 
The agricultural, science and engineering divisions of the College will 
make available advanced courses for students who wish to supplement 
their work in statistics. 

The Virginia Polytechnic Institute is located at Blacksburg in the 
scenic Allegheny Mountains. The summer climate is delightful. 

The fee for the Virginia Polytechnic Institute session is $30.00. 
Board, room, post office box and laundry for the entire session may be 
had for $76.40. The session will run from June 9 through July 17, 1954. 

Inquiries should be addressed to Boyd Harshbarger, Head, Department 
of Statistics, Virginia Polytechnic Institute, Blacksburg, Virginia. 


Special Statistics Session for Research Engineers, Physicists and Chemists 


During the Spring quarter of 1954 (March 24 to June 4) the Institute 
of Statistics of the University of North Carolina will sponsor a special 
program of course work, lectures and seminars on statistics for research 
engineers, physicists and chemists. The primary objective of this 
program is to provide an opportunity for industrial research workers 
to acquire a working knowledge of modern statistical concepts and 
techniques. Emphasis will be on the efficient design of experiments 
and the analysis of data therefrom. Informal seminars on statistical 
problems submitted by the participating students will be held. Guest 
lecturers will include Dr. W. J. Youden and Dr. M. G. Kendall. Regular 
college credit will be granted for course work satisfactorily completed. 
For further information write to Institute of Statistics, North Carolina 
State College, Box 5457, Raleigh, N. C. 


Meeting of The Biometric Society, Gainesville, Florida 


Titles and abstracts for contributed papers for The Biometric 
Society meeting to be held at the University of Florida, March 17, 18, 
19 and 20, 1954, should be sent to Boyd Harshbarger, Department of 
Statistics, Virginia Polytechnic Institute, Blacksburg, Virginia, not 
later than February 20, 1954. 